Chapter 1


Differential argument marking: Patterns of variation
Alena Witzlack-Makarevich 


University of Kiel 


Ilja A. Seržant
Leipzig University 


In this introductory article we provide an overview of the range of the phenomena that 
can be referred to as differential argument marking (DAM). We begin with an overview of 
the existing terminology and give a broad definition of DAM to cover the phenomena
discussed in the present volume and in the literature under this heading. We then consider 
various types of the phenomenon which have figured prominently in studies of DAM in 
various traditions. First, we differentiate between arguments of the same predicate form 
and arguments of different predicate forms. Within the first type we discuss DAM systems 
triggered by inherent lexical argument properties and the ones triggered by non-inherent, 
discourse-based argument properties, as well as some minor types. It is this first type that 
traditionally constitutes the core of the phenomenon and falls under our narrow definition 
of DAM. The second type of DAM is conditioned by the larger syntactic environment, such 
as clause properties (e.g. main vs. embedded) or properties of the predicate (e.g. its TAM 
characteristics). We then discuss the restrictions that may constrain the occurrence
of DAM cross-linguistically, as well as other typical features of DAM systems pertaining to the
morphological realization (symmetric vs. asymmetric) or to the degree of optionality of DAM.
Finally, we provide a brief overview of functional explanations of DAM.


1 Introduction 


In this introductory article we provide an overview of the range of phenomena that 
can be referred to as differential argument marking (DAM).¹ We begin this introduction
with a survey of the existing terminology (this section). We then proceed to consider
individual aspects of the phenomenon which have figured prominently in studies of
DAM in various traditions (§2 and §3).


¹ Both authors contributed equally to the writing of this paper.
The term differential marking (or, to be historically precise, differential object marking,
abbreviated as DOM) was first used by Bossong (1982; 1985) in his investigations of the
phenomenon in Sardinian and New Iranian languages. Somewhat older than this term
is the term split (as in split ergativity) used in the line of research focusing primarily on 
the differential marking of the agent argument. It has been in use since Silverstein (1976) 
and was popularized by Dixon (1979; 1994). 

Recent years have been marked by a growing interest in differential marking, and as a
result numerous related terms have been coined to refer to individual roles marked
differentially and particular patterns of differential marking. For example, de Hoop & de Swart
(2008b) were the first to systematically discuss differential subject marking (DSM). Here, 
the syntactic term subject was understood rather broadly including different kinds of less 
canonical, subject-like arguments. Later, notions covering more specific argument roles 
were introduced: Fauconnier (2011) studies differential agent marking, whereas
Haspelmath (2007) and Kittilä (2008) explore differential recipient marking or differential goal
marking, as well as differential theme marking. Another notion that is subsumed under 
DAM is optional ergative marking (cf. among others McGregor 1992; 1998; 2006; 2010; 
Meakins 2009; Gaby 2010). As these and other authors show, in addition to the semantic 
function of encoding agents, ergative case is sometimes also employed to mark focal, 
unexpected or contrastive agent arguments. In addition, Sinnemäki (2014), observing that
the term DOM sometimes implies an assumption as to which factors trigger differential
marking, introduced the term restricted case marking (of the object) to cover all cases
of differential marking no matter what the respective factors are. Finally, in the
traditions of DAM research on individual language families and languages, many more
language-, role- or marking-specific labels have been used, for instance, prepositional ac- 
cusative in Romance linguistics (e.g. Torrego Salcedo 1999) or bi-absolutive construction 
in the Nakh-Daghestanian languages (e.g. Forker 2012). 

The list of terms provided above makes it clear that research on differential mark- 
ing has focused primarily on arguments. However, differential argument marking can 
be viewed as a subtype of a larger phenomenon which manifests itself in a complex 
interaction between the meaning and function of a particular marking pattern, on the 
one hand, and some properties of the constituents involved (both arguments and
adjuncts), on the other. For instance, the Persian marker -rā is not only used with direct
object NPs but can follow nearly all kinds of constituents except for subject NPs: one 
finds it marking time-adverbial NPs, objects of prepositions, etc. (cf. various examples 
in Dabir-Moghaddam 1992; for a different example see the discussion of differential time 
adverbial marking in Baltic in Seržant 2016: 141–154). Besides, case marking need not be
fully paradigmatic and different cases/adpositions impose different selectional restric- 
tions on the type of nominals they can mark. These restrictions may potentially create 
paradigmatic gaps and differential marking with both arguments and adjuncts. The main 
condition for this is the semantic compatibility between the meaning of a particular 
case/adposition and the nominal (Comrie 1986; Aristar 1997; Creissels & Mounole 2011). 
For example, Aristar (1997) shows that locational cases/adpositions are often less or zero 
marked with place names but require a dedicated suffix with other nouns which are less 
expected to occur in expressions denoting location. Similarly, animacy is an important
factor that decreases the likelihood that cases such as instrumental, ablative or locative
occur. Hence, highly animate nominals may either not form the locative cases at all or
require additional marking. In turn, cases/adpositions such as dative or comitative typ- 
ically require animate participants. Having said this, in what follows we will focus on 
differential marking of arguments primarily for reasons of space. 

As is obvious from the plethora of terms listed above, differential marking is a very 
broad notion that covers a wide range of different phenomena. Given that the investiga- 
tions in the present volume are aimed at diachronic processes we cannot a priori focus 
on a subset of cases for something that we treat here as being in flux, thereby leaving out 
phenomena that have the potential to develop into DAM in a more accepted sense (or in 
fact have been attested to undergo this development), as well as those phenomena that 
arguably originate from DAM but exhibit somewhat deviating properties due to later de- 
velopments. For this reason, we keep the definition of DAM fairly broad. We will use the 
term DAM as defined in (1) (drawing on Woolford 2008; Iemmolo & Schikowski 2014).²


(1) Broad definition of DAM: 
Any kind of situation where an argument of a predicate bearing the same 
generalized semantic argument role may be coded in different ways, depending 
on factors other than the argument role itself, and which is not licensed by 
diathesis alternations. 


It follows from this definition that DAM is not restricted to case marking in the broad
sense (also called dependent marking or flagging), which subsumes both morphological
case and adposition marking (cf. Haspelmath 2005), but also includes differential
agreement (or head marking or indexing). For example, Iemmolo (2011) has introduced the
term differential object indexing (DOI) to refer to cases of differential argument
marking on the verb in contrast to differential case marking on the noun phrase. Whereas
some linguists think that the two types of differential marking share commonalities (e.g. 
Dalrymple & Nikolaeva 2011: 1-2), others claim that they are different in terms of their 
functions and triggers and may emerge from different diachronic processes (de Hoop & 
de Swart 2008a: 5; Iemmolo & Schikowski 2014). While we agree with this second view, 
we are open to the possibility that there might nevertheless be considerable overlap in 
both diachrony and synchrony. 

To capture the different kinds of DAM systems, we put forward a coordinate system in 
which we highlight the aspects that we consider central for the understanding of DAM 
and give a narrower definition of DAM in (16). Both definitions will be used in the present 
volume and, in fact, there is often a diachronic relationship between them. In what fol- 
lows we will first provide an overview of the properties staking out the phenomenon of 
DAM. 

We begin with an overview of the synchronic variation of the phenomenon and first
consider the argument-triggered DAM systems (§2.1). In particular, we discuss both
inherent lexical argument properties (§2.1.1; §2.1.2) and non-inherent discourse-based
argument properties (§2.1.3) and proceed with the properties of the larger syntactic
environment (§2.1.5). §2.2 covers DAM cases triggered by various predicate properties. §2.3
provides a brief summary of the various triggers for DAM. In §2.4, we introduce
various restrictions that constrain the occurrence of DAM cross-linguistically. §3 is devoted
to realization properties of DAM. §3.1 discusses the morphological distinction between
symmetric vs. asymmetric DAM types. We then contrast different loci of realization of
DAM: head-marking and dependent-marking (§3.2). §3.3 highlights differences in
syntactic (behavioral) properties found with DAM. The distinction between obligatory and
optional DAM is introduced in §3.4. §3.5 provides a brief summary of the factors involved in
variation. Finally, we discuss a few functional explanations (§4) and conclusions (§5).


² Some authors go even further and consider inverse systems and voice alternations as instances of DAM
(e.g. de Hoop & de Swart 2008a: 1).


2 Synchronic variation of DAM 


As defined above, DAM encompasses a range of phenomena sharing the trait of encod- 
ing the same argument role in different ways. However, apart from this shared property 
DAM systems vary from language to language. To allow for the comparison of DAM 
systems and their diachronic development paths, we decompose the phenomenon into 
a number of characteristics which build upon the attested synchronic variation and sug- 
gestions made in the literature on the topic. 

In what follows we introduce two orthogonal distinctions of DAM systems:
argument-triggered DAM (§2.1) vs. predicate-triggered DAM (§2.2) and restricted DAM vs.
unrestricted DAM (§2.4). We begin by considering those DAM systems where the differential
argument marking may be found with one and the same form of the predicate
(henceforth: argument-triggered DAM). For this type of DAM a number of variables are needed
to account for the attested variation. These are various properties of arguments
(§2.1.1–§2.1.3) and event semantics (§2.1.5). In §2.2, we will turn to predicate-triggered DAM
types, all of which have in common that the differential argument marking depends on 
the actual form of the predicate involved. 


2.1 Argument-triggered DAM


The properties of arguments can determine DAM in two ways. First, the properties of the
differentially marked argument alone can be responsible for a particular marking.
Second, the properties of more than one argument in a clause, i.e. the whole constellation
of arguments, also referred to as scenario, can determine a particular marking. The first
type is discussed in §2.1.1–§2.1.3 and summarized in §2.1.4, whereas the second type is
considered in §2.1.5. In both cases, the relevant argument properties include a wide range
of inherent lexical (semantic and formal), as well as non-inherent, first of all pragmatic 
characteristics of arguments. These subtypes are considered in individual subsections. 
We thus follow Bossong (1991: 159) who first made the distinction between inherent
and non-inherent properties of the NP in the context of DOM (cf. Sinnemäki 2014: 282,
who distinguishes between referential and discourse properties). Inherent properties of
arguments (semantic and formal) are considered in §2.1.1–§2.1.2, non-inherent
discourse-based properties are discussed in §2.1.3. Finally, we isolate as a subtype of DAM triggers
those cases where argument properties closely linked to the semantics of the respective event
determine the type of marking (§2.1.6).


2.1.1 Inherent lexical argument properties 


Many of the properties we cover in this and the following section are often represented 
as integrated into various implicational hierarchies or scales. One of the most cited ver- 
sions of such hierarchies is given in (2). It was introduced by Dixon (1979) as potentiality 
of agency scale and was based on Silverstein's (1976) hierarchy of inherent lexical content. 
A similar hierarchy was independently introduced by Moravcsik (1978) as activity scale.³
The hierarchy was widely popularized by Croft (2003: 130) as the extended animacy
hierarchy. Other common versions of the hierarchy include DeLancey's (1981) empathy
hierarchy in (3), Aissen's (1999) prominence hierarchy given in (4), and the indexability
hierarchy in Bickel & Nichols (2007).


(2) first person pronoun > second person pronoun > third person pronoun > proper
nouns > human common noun > animate common noun > inanimate common
noun (Dixon 1979: 85)


(3) speech-act-participant (SAP) > 3rd person human > 3rd person > non-human
animate > inanimate (adapted from DeLancey 1981: 627-628)


(4) local person > pronoun 3rd > proper noun 3rd > human 3rd > animate 3rd > 
inanimate 3rd (Aissen 1999: 674) 


These and similar complex hierarchies involve a range of distinct dimensions, such as
person or animacy (cf. Croft 2003: 130). These dimensions may be more or less relevant
in shaping DAM systems in individual languages (see Aissen 1999 for examples).
The major reason for the suggestion of extended versions of hierarchies, as in (2) or (3), 
is the fact that individual dimensions are not entirely orthogonal. Personal pronouns are 
not only inherently animate (except for the third person, cf. English it), they are also 
inherently definite and highly accessible referents. Therefore, they are highest ranked 
also on hierarchies based on definiteness (see 82.1.2) and on the accessibility hierarchy 
(cf. Ariel 1988; 2001) or in terms of topic-worthiness (Wierzbicka 1981). On the other 
hand, some authors (e.g. Dahl 2008) argue that complex hierarchies are problematic in 
many respects and should rather be viewed in terms of a combination of different fac- 
tors operating simultaneously and not as one, unidimensional factor. Thus, though first 
and second person referents are always animate, whereas the third person referents can 
be both animate and inanimate, there is no reason to regard animate third person
referents as less animate than first and second person referents (cf. Comrie 1989: 195).
Analogously, personal pronouns, proper names or definite NPs are not distinct in terms of


³ For a more extensive overview of the history of research on the effects of referential hierarchies on
differential marking, see Filimonova (2005).
definiteness: these NP types are equally definite (cf. von Heusinger & Kaiser 2003: 45).
Several researchers have proposed to decompose the single complex hierarchy into sev- 
eral layers or sub-hierarchies (cf. Croft 2003: 130; Siewierska 2004: 149). The advantage 
of such multi-layered hierarchies is that their sub-hierarchies are logically independent, 
and each hierarchy may have more or less influence on shaping the grammatical system 
of an individual language (Haude & Witzlack-Makarevich 2016). 

In what follows we first provide an overview of individual dimensions contributing 
to the complex hierarchies discussed above and relevant for DAM and then present a 
few examples. We begin this overview with the inherent lexical argument properties 
which have a semantic component. The relevant dimensions and their levels are listed 
in Table 1.⁴ These are probably the most frequently discussed factors behind DAM and
examples of their effects on case marking or agreement can be easily found in the
literature (e.g. Silverstein 1976; Aissen 1999; Dixon 1994). Note that these dimensions are
still inherently complex in the sense that they can be further decomposed into a range
of binary features as in Silverstein's (1976) original proposal (e.g. [±animate], [±human],
[±ego]) or in Bossong (1991: 159).


Table 1: Inherent semantic argument properties. 


Dimension     Example

Person        First & Second person > Third person > (Obviative / Fourth person)
              (cf. Dixon 1979: 85; Croft 2003: 130)
Animacy       Humans > Animate non-humans (animals) > Inanimate
              (cf. Bossong 1991: 159; Silverstein 1976; Aissen 2003)
Uniqueness    Proper nouns > Common nouns (e.g. as part of Croft 2003: 130)
Discreteness  Count nouns > Mass nouns (cf. Bossong 1991: 159)
Number        Singular vs. Plural vs. Dual


The individual levels in Table 1 are ordered - where possible - in an implicational 
hierarchy. With respect to argument marking these hierarchies are meant to reflect ei- 
ther universal constraints on possible splits in alignment of case and agreement and/or 
the cross-linguistic frequency of actual language types (cf. Croft 2003: 123). For instance, 
according to one reading, the types at the top of the hierarchies tend to show accusative 
alignment, whereas the ones at the bottom of the hierarchy tend to align ergatively (cf. 
Silverstein 1976, see also Bickel et al. 2015 for the testing of the effects of various hierar- 
chies on alignment against a large sample of over 370 case systems worldwide). 
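
The reading on which higher-ranked types align accusatively and lower-ranked types ergatively can be sketched as a single cut-off point on the hierarchy in (2). The following Python sketch is only an illustration under that assumption: the function names and the example cut-off are hypothetical and do not describe any particular language.

```python
# Hierarchy (2), leftmost = highest rank (Dixon 1979: 85).
HIERARCHY = [
    "first person pronoun",
    "second person pronoun",
    "third person pronoun",
    "proper noun",
    "human common noun",
    "animate common noun",
    "inanimate common noun",
]


def alignment(np_type, cutoff):
    """Hypothetical split: types at or above the cut-off align
    accusatively, types below it align ergatively."""
    if HIERARCHY.index(np_type) <= HIERARCHY.index(cutoff):
        return "accusative"
    return "ergative"


def respects_hierarchy(observed):
    """Check whether an observed split is compatible with the hierarchy:
    every accusative type must outrank every ergative type."""
    acc = [HIERARCHY.index(t) for t, a in observed.items() if a == "accusative"]
    erg = [HIERARCHY.index(t) for t, a in observed.items() if a == "ergative"]
    return not acc or not erg or max(acc) < min(erg)


# An invented system with the cut-off after third person pronouns.
system = {t: alignment(t, "third person pronoun") for t in HIERARCHY}
print(system["second person pronoun"])   # accusative
print(system["inanimate common noun"])   # ergative
print(respects_hierarchy(system))        # True

# An invented "hierarchy offender": ergative pronoun, accusative noun.
offender = {"first person pronoun": "ergative", "proper noun": "accusative"}
print(respects_hierarchy(offender))      # False
```

The second function corresponds to the constraint reading of the hierarchy; the frequency reading would instead count how often systems in a sample pass or fail such a check.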

By listing the dimensions individually in Table 1 we do not imply that for each of them
there exists a DAM system in which a particular property is the only trigger of DAM.
Rather, in the vast majority of languages these and further dimensions to be introduced 
later interact in an intricate fashion. For instance, we do not know of any language in 


⁴ Some authors rank the first and the second persons relative to each other, e.g. Dixon (1979: 85) ranks the first person over the
second person.
which number is the only relevant dimension, but there are many synchronic cases in
which a combination of person and number provides an exact characterization of the 
split in marking, which is particularly common within pronouns (see Bickel et al. 2015 
for examples). Number is also known to play a role in the diachrony of DAM. For
instance, in Old Russian, primarily animacy-driven DOM started out in singulars and
spread further to plurals. In this language, DOM (genitive vs. zero accusative) is attested
with singular masculine proper names and human nouns from the earliest original Old
Russian sources on, i.e. from the 11th c., representing the Common Slavic inheritance.
Animacy-driven DOM then spread to plurals during the 13th–15th centuries
and to nouns referring to animals in the 16th c. (inter alia, Krys'ko 1994: 61). The dual
forms developed animacy-driven DOM from the 12th–14th c. (Krys'ko 1994: 98). There is
evidence that the plural forms acquired DOM during approximately the same time period
as the dual in Old Russian.

Not all of the properties listed in Table 1 apply to both DSM and DOM to the same 
extent. For instance, animacy is sometimes claimed to be a relevant parameter for DOM, 
while DSM/Differential Agent Marking systems that are organized exclusively along the 
animacy scale are rare (Fauconnier 2011). Fauconnier (2011) demonstrates that indepen- 
dently acting inanimates may pattern with animates with respect to Differential Agent 
Marking, while being distinct from inanimates acting non-independently (via human in- 
stigation). (See also Sinnemäki 2014 on the frequency of animacy as a factor conditioning 
DOM.) 

Finally, animacy may have an effect on DAM in a less straightforward way. Thus,
von Heusinger & Kaiser (2007; 2011) and von Heusinger (2008) investigate the impact of 
animacy on the diachronic development of DOM in Spanish. They show that for a partic- 
ular subset of objects, namely for both definite and indefinite human direct objects, the 
preference for a-marking depends among other things on the verb class. If the respec- 
tive verb regularly takes human or animate objects, it tends to use the a-marking on its 
human objects more frequently than the verbs which regularly take inanimate objects. 
This trend is stable across different periods irrespective of the overall preference for the 
a-marking of objects. 


2.1.2 Morphological argument properties


Apart from the inherent semantic properties of arguments discussed in §2.1.1, differences
in argument marking may often be better captured in terms of inherent morphological 
properties of the relevant arguments. The latter include the part-of-speech distinction 
(pronoun vs. noun) and - much less frequently discussed - gender/inflectional-class dis- 
tinctions. These two types of DAM will be discussed in what follows. 

The pronoun vs. noun distinction is one of the most common lines of split in case 
marking worldwide (cf. Bickel et al. 2015). For instance, in Jingulu all pronominal patient- 
like arguments are marked with the accusative suffix -u, as in (5), whereas all nominal 
patients are in the unmarked nominative case, no matter whether they are animate, as 
in (6c) and (6d), human, as in (6d), or definite, as in (6b–6d):
(5) Jingulu (Mirndi; Pensalfini 1997: 102, 160, 247)


a. Angkurla larrinka-nga-ju   ngank-u.
   NEG      understand-1SG-do 2SG-ACC
   'I didn't understand you.'

b. Ngiji-ngirri-nyu-nu   kunyaku.
   see-1PL.EXCL-2OBJ-did 2DU.ACC
   'We saw you two.'

c. Jaja-mi  ngarr-u!
   wait-IRR 1SG-ACC
   'Wait for me!'


(6) Jingulu (Mirndi; Pensalfini 1997: 100, 198, 249, 275) 

a. Ngangarra ngaja-nga-ju.
   wild.rice see-1SG-do
   'I can see wild rice.'

b. Jani madayi-rni    ngaja-nya-ju?
   Q    cloud.NOM-FOC see-2SG-do
   'Can you see the cloud?'

c. Wiwimi-darra-rni warlaku ngaja-ju.
   girl-PL-ERG      dog.NOM see-do
   'The girls see the dog.'

d. Ngaja-nga-ju niyi-rnini nayurni.
   see-1SG-do   3SG.GEN-F  woman.NOM
   'I can see his wife.'


Differential case marking here is the consequence of a larger phenomenon that consists 
in pronouns patterning differently from nouns when it comes to argument marking. 
First, pronominal case-markers are often phonologically (and etymologically) distinct 
from the nominal ones. As Filimonova (2005) points out, pronouns belong to the most 
archaic parts of the lexicon and might be more stable and resistant to morphological and 
phonological changes than nouns and, hence, preserve the older case markers longer 
than nouns. On the other hand, pronouns often are subject to stronger syntactic con- 
straints. This might also be part of the explanation for why pronouns - especially those 
referring to the speech act participants - represent the most notorious hierarchy offend- 
ers (see examples in Bickel et al. 2015). 

Finally, inherent properties can only be viewed as triggers of DAM but not as its func- 
tion or result since these properties (such as pronouns vs. nouns or animate vs. inanimate 
distinctions) are already coded lexically (Klein & de Swart 2011: 4–5).

The second group of inherent morphological argument properties which can trigger 
DAM are gender and inflectional classes. For example, in Icelandic, certain noun classes 
distinguish between nominative and accusative while others do not (Thráinsson 2002:
153); compare the two examples:


(7) Icelandic (Indo-European; Thráinsson 2002: 153) 
a. tím-i 'time-NOM.SG' vs. tím-a 'time-ACC.SG' (masculine weak I)
b. nál 'needle-NOM.SG' and 'needle-ACC.SG' (feminine strong I)


In other languages, different inflectional classes have different but always overt
allomorphs of a marker, e.g. in Kuuk Thaayorre (Pama-Nyungan, Australia), in which
there are three ergative allomorphs depending on the conjugation class plus minor
patterns: the ergative is marked either with the suffix -(n)thurr, or with a lexically specified
suffixed vowel plus the segment /l/ (Gaby 2006: 158-163).

This type of differences in argument marking is only rarely discussed in the context 
of DAM, probably due to the fact that inflectional class assignments in many languages 
are only partly semantically conditioned (e.g. by the sex of their extensions) and are 
otherwise idiosyncratic and thus do not yield any obvious functional explanations.
Exceptions are the typological study by Bickel et al. (2015) and a few discussions
of DAM in individual languages, e.g. Karatsareas (2011) on Cappadocian Greek. Another
reason for the neglect of this type of DAM is probably that many
studies on DAM, starting with Silverstein (1976), were interested in different alignment
patterns resulting from DAM and not in DAM yielding identical alignment patterns, as is 
the case in languages which use different overt allomorphs of a marker, such as in Kuuk 
Thaayorre, where the overall alignment pattern does not change despite the difference 
in marking. 

Sometimes differences between inflectional classes might be viewed as a diachronic
effect of "morphologization" of a previously semantically constrained DAM. Russian
seems to be undergoing this process: the animacy-driven DOM expressed by the opposition
of the former accusative case (zero) (stol-ø 'table-ACC/NOM') vs. genitive case
(čelovek-a 'human-ACC/GEN') is now becoming just one heterogeneous accusative case with two
allomorphs depending on the particular noun and, hence, on its inflectional class. The
allomorphy can be argued for by applying various syntactic and substitution tests. For 
example, Corbett (1991: 165-167) treats animacy in Russian as a sub-gender. 


2.1.3 Non-inherent, discourse-based argument properties


Apart from the inherent semantic and morphological lexical argument properties
discussed in §2.1.1–§2.1.2 above, a range of further characteristics related to how referents
are used in discourse are known to interact with DAM. On the one hand, these prop- 
erties include such semantic dimensions as definiteness and specificity; on the other 
hand, they include other categories considered under the umbrella term of INFORMA- 
TION STRUCTURE. 


Definiteness and specificity As examples of the effect of definiteness and specificity
on argument marking, in particular on DOM, are abundant and easy to find, in
this section we only briefly introduce this type of DAM. Definiteness and specificity
are notoriously difficult to define. A common proxy for definiteness is the semantic- 
pragmatic notion of identifiability. Thus, a definite argument is one for which the hearer 
can identify the referent (Lyons 1999: 2-5). In a similar way, Lambrecht (1994) defines 
identifiability as reflecting “a speaker's assessment of whether a discourse representa- 
tion of a particular referent is already stored in the hearer's mind or not" (Lambrecht 
1994: 76). In contrast to definiteness, which depends both on the speaker and the hearer, 
specificity only depends on the speaker; a nominal is specific whenever the speaker has 
a "particular referent in mind" (Lyons 1999: 35).⁵ As the two phenomena of definiteness
and specificity interact closely, they are frequently integrated into one hierarchy, as in (8) 
(see e.g. Comrie 1986: 94; Croft 2003: 132): 


(8) definite > (indefinite) specific > (indefinite) non-specific


A recent study by Sinnemäki (2014) investigates the effect of definiteness and specificity
on DOM and finds that in 71 of 178 languages with DOM in his sample (and in 43
out of 83 genealogical units) definiteness and/or specificity play a role, though the
respective geographic distribution is somewhat biased: DOM systems in the languages of the Old
World (Africa, Europe, and Asia) are more prone to be affected by this feature than those in the
languages of Australia, New Guinea and the Americas.
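
The counts reported from Sinnemäki (2014) can be converted into proportions with a short calculation. The sketch below only restates the figures given in the text; the helper name share is an illustrative assumption.

```python
def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts as reported in the text for the sample of languages with DOM.
languages_pct = share(71, 178)   # languages where definiteness/specificity matter
units_pct = share(43, 83)        # genealogical units where they matter

print(languages_pct)  # 39.9
print(units_pct)      # 51.8
```

In other words, definiteness and/or specificity play a role in roughly two fifths of the languages and about half of the genealogical units in that sample.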


Information structure The effects of another type of discourse-based properties of
arguments on DAM, viz. information-structure properties, were already noticed in
early studies of DAM (e.g. Laca 1987 on Spanish; Bossong 1985) and have become
particularly prominent in some recent studies on DAM, including McGregor (1998; 2006)
on differential agent marking, as well as Iemmolo (2010); von Heusinger & Kaiser (2007;
2011); Escandell-Vidal (2009) and Dalrymple & Nikolaeva (2011) on DOM. In what follows
we provide an outline of some of the claims.

Dalrymple & Nikolaeva (2011: 14) claim that many seemingly unpredictable cases of 
variation in DOM can be accounted for by considering information structure, understood 
as that level of sentence grammar where propositions (i.e. conceptual states of affairs) 
are structured in accordance with the information-structure role of sentence elements. 
Specifically, topicality plays a critical role in many cases of DOM, such that the distri- 
bution of the differential marking depends on whether the object is a SECONDARY TOPIC 
or (part of) the focus constituent (Nikolaeva 2001; Dalrymple & Nikolaeva 2011). In this 
line of research, secondary topic is understood as “an element under the scope of the 
pragmatic presupposition such that the utterance is construed to be about the relation 
that holds between it and the primary topic" (Nikolaeva 2001: 2). Iemmolo (2010) argues
against Dalrymple & Nikolaeva's (2011) suggestion and claims that DOM is primarily re- 
lated to primary topics and special marking is reserved for pragmatically atypical objects, 
which are primary (or aboutness) topics. 


⁵ For an overview of the history of research on specificity and other approaches to specificity, see
von Heusinger (2011).
Apart from topicality, focality also figures as a demarcation line for DAM, particularly
in cases of a variant of differential agent marking called optional ergativity. For instance, 
in Central (Lhasa) Tibetan (Sino-Tibetan) unmarked agent arguments are associated with 
unmarked information distribution, whereas the use of the ergative marker yields a read- 
ing with emphasis (focus) on either the identity or the agency of the agent (cf. Tournadre 
1991). While it is somewhat difficult to define and operationalize the notion of emphasis 
or focality, related notions of unexpectedness, surprise or unpredictability of the referent 
might be better terms in describing individual DAM systems. For instance, Schikowski 
(2013) uses the term unexpectedness in addition to various other inherent (animacy) and 
context-dependent (specificity) properties to explain DOM in Nepali. In Warrwa (Nyul- 
nyulan, Western Australia), NPs are marked with the focal ergative marker -nma, as 
in (9b), when they are “unexpected, unpredictable, or surprising in terms of their iden- 
tity and agentivity" (McGregor 2006: 399), otherwise they are marked with a different 
ergative exponent, viz. -na, as in (9a). To account for the distribution of the two markers
in continuous stretches of discourse, McGregor (1998: 516) postulates the Expected
Actor Principle: "The episode protagonist is – once it has been established – the expected
(and unmarked) Actor of each foregrounded narrative clause of the episode; any
other Actor is unexpected".


(9) Warrwa (Nyulnyulan, Western Australia; McGregor 2006: 402)

a. nyinka  jurrb  ø-ji-na-yina             kinya  wanyji  kwiina
   this    jump   3minNOM-say-PST-3minOBL  this   later   big
   iri     ka-na-ngka-ndi-ø             ø-ji-na,         kinya-na
   woman   1minNOM-TR-FUT-get-3minACC   3minNOM-say-PST  this-ERG
   wuba,
   small
   ‘The little one jumped at her then, at the big woman, and tried to get her.’

b. kinya  kwiina-nma  iri    marlu  laj    ø-ji-na-ø
   this   big-fERG    woman  not    throw  3minNOM-say-PST-3minACC
   kinya  wuba,   laj,   marlu  laj    ø-ji-na-ø,
   this   little  throw  not    throw  3minNOM-say-PST-3minACC
   ‘But no, the big woman threw the little man away.’


To summarize, the information-structure roles that are typically coded by DAM are 
foci with S and A arguments and topics with P arguments. Rarely also the status of P 
arguments as focal or non-focal triggers DOM (e.g. in Yukaghir, isolate; Maslova 2003; 
2008), while topicality-triggered differential A marking seems unattested. This asymme- 
try may be explained by the findings of Maslova (2003) and Dalrymple & Nikolaeva 
(2011), who show that in the languages they considered P is common both as focus and 
topic, while A’s predominantly occur as topics. For instance, P’s are 65% topics in Tun- 
dra Yukaghir and 60% topics in Ostyak while they are respectively 35% foci in Tundra 
Yukaghir and 40% foci in Ostyak (Maslova 2003: 182; Dalrymple & Nikolaeva 2011: 167). 




In turn, of all nominal foci of Maslova's Yukaghir corpus 97% are P foci and less than 1% 
are A foci (Maslova 2003: 182; 2008: 796). 


2.1.4 Argument-triggered DAM: a summary 


The clean typology of argument effects on DAM presented above is an idealization: In 
many languages argument-triggered DAM systems are conditioned by an intricate com- 
bination of both inherent and non-inherent properties. For example, the DOM in Spanish 
is primarily conditioned by animacy (an inherent property) but inanimates allow for vari- 
ation depending on factors such as definiteness and specificity. Moreover, while definites 
are always marked, indefinites again allow for variation of marking where topicality, se- 
mantic verb class, preverbal position may favor the marking (von Heusinger & Kaiser 
2007; 2011). According to Escandell-Vidal (2009), pronominal objects in Balearic Catalan 
are always case-marked by accusative, i.e. an inherent part-of-speech characteristic of 
the argument is at work, whereas with non-pronominal objects case marking is partly 
determined by topicality. The DOM of Biblical Hebrew is conditioned by a highly com- 
plex set of factors from different domains of grammar, including alongside animacy and 
definiteness, modality (volitionals) and polarity (under negation) of the verb, preverbal 
position of the object NP, presence of the reflexive possessor, etc. (Bekins 2012: 173). 


2.1.5 Properties of scenario and global vs. local DAM systems 


In §2.1.1-§2.1.4 we discussed how various inherent and discourse-based properties of ar- 
guments affect argument marking. This type of DAM conditioned by argument-internal 
properties is sometimes referred to as LOCAL (Silverstein 1976: 178; Malchukov 2008: 213, 
passim). However, not only the properties of differentially marked arguments themselves 
might be relevant: In some languages, argument marking is sensitive to the properties 
of other arguments of the same clause, i.e. to the nature of the co-arguments. In other 
words, not only one argument on its own, but the whole configuration of who is acting 
on whom can shape DAM systems. This type of DAM is labeled GLOBAL by Silverstein 
(1976: 178), because the assignment of case-marking is regulated on the global level of 
the event involving all arguments. Following Bickel (1995; 2011) and Zúñiga (2006), such 
argument configurations will be referred to as scenarios in what follows. Within flag- 
ging the effects of scenarios are not common, but they are well known in the domain 
of indexing under the notion of HIERARCHICAL AGREEMENT (cf. Siewierska 2003; 2004: 
51-56). 

Effects of scenarios on case marking can be illustrated with object marking in Agua- 
runa. In this language, the object argument is marked in one of two ways. First, it can 
be in the unmarked nominative, such as the nominal argument yawaá ‘dog.NOM’ in (10a) 
and the pronominal arguments ni ‘3SG.NOM’ in (10b) or hutii ‘1PL.NOM’ in (10c): 


(10) Aguaruna (Jivaroan, Peru; Overall 2007: 155, 443, 444) 
a. Yawaá    ii-nau    maa-tfa-ma-ka-umi?
   dog.NOM  1PL-POSS  kill.HIAF-NEG-REC.PST-INT-2SG.PST
   ‘Have you killed our dog?’
b. Ni       iima-ta.
   3SG.NOM  carry.PFV-IMP
   ‘You (sg.) carry him!’
c. Hutii    ainau-ti  atumi    wai-hatu-ina-humi-i.
   1PL.NOM  PL-SAP    2PL.NOM  see-1PL.OBJ-PL.IPFV-2PL-DECL
   ‘You (pl.) see us.’


Second, objects can be marked with the accusative case suffix -na, such as biika-na ‘beans-ACC’
in (11a), ii-na ‘1PL-ACC’ in (11b) or ami-na ‘2SG-ACC’ in (11c): 


(11) Aguaruna (Jivaroan, Peru; Overall 2007: 146, 326, 444) 

a. Ima      biika-na-ki     yu-a-ma-ha-i.
   INTENS   bean-ACC-RESTR  eat-HIAF-REC.PST-1SG-DECL
   ‘I only ate beans.’

b. Ni       ii-na    antu-hu-tama-ka-aha-tata-wa-i.
   3SG.NOM  1PL-ACC  listen-APPL-1PL.OBJ-INTENS-PL-FUT-3-DECL
   ‘He will listen to us.’

c. Hutii    a-ina-u-ti           daka-sa-tata-hami-i                ami-na.
   1PL.NOM  COP-PL.IPFV-REL-SAP  wait-ATT-FUT-1SG>2SG.OBJ-DECL  2SG-ACC
   ‘We will wait for you.’


As (10c) and (11b) demonstrate, an object with identical referential properties (first per- 
son plural pronoun) can be either in the nominative or in the accusative case. Thus, 
the internal properties of arguments cannot be the trigger of DOM in Aguaruna. The 
information-structural properties are not relevant either. Instead, the distribution of the 
two types of object marking is determined by the configuration of the referential
properties of both transitive arguments (the A and the P) and is summarized as follows: 


Object NPs are marked with the accusative suffix -na, with some exceptions, that 
are conditioned by the relative positions of subject and object on the following 
person hierarchy: 


1sg > 2sg > 1pl/2pl > 3 


First person singular and third person subjects trigger accusative case marking on 
any object NP, but second person singular, second person plural, and first person 
plural only trigger marking on higher-ranked object NPs. (Overall 2007: 168-169) 
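
The rule quoted above can be restated procedurally. The following Python sketch is an illustration only and is not part of Overall's (2007) description: the function name `object_case`, the person labels and the numeric ranks are assumptions introduced here, and the "some exceptions" mentioned in the quotation are ignored. Under our reading of the A arguments in (10) and (11), the sketch returns the case found on the object in each of the six examples.

```python
# Hypothetical encoding of the person hierarchy 1sg > 2sg > 1pl/2pl > 3.
# A smaller number means a higher position on the hierarchy.
RANK = {"1sg": 0, "2sg": 1, "1pl": 2, "2pl": 2, "3": 3}


def object_case(subject: str, obj: str) -> str:
    """Return 'ACC' or 'NOM' for the object NP, following the quoted rule."""
    # First person singular and third person subjects trigger the
    # accusative on any object NP.
    if subject in ("1sg", "3"):
        return "ACC"
    # 2sg, 1pl and 2pl subjects trigger the accusative only on objects
    # that are ranked higher than the subject.
    return "ACC" if RANK[obj] < RANK[subject] else "NOM"


# (A, P) configurations as we read them off examples (10a-c) and (11a-c).
examples = {
    "10a": ("2sg", "3"),
    "10b": ("2sg", "3"),
    "10c": ("2pl", "1pl"),
    "11a": ("1sg", "3"),
    "11b": ("3", "1pl"),
    "11c": ("1pl", "2sg"),
}

for label, (a, p) in examples.items():
    print(label, a, p, object_case(a, p))
```

The point of the sketch is that the function needs both arguments as input: no function of the object alone could return NOM for the first person plural object in (10c) and ACC for the one in (11b).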


Similar cases have been reported from other languages. Thus, Malchukov (2008: 213) 
states that, unlike in Hindi, where DOM is purely locally constrained, the related 
language Kashmiri has globally conditioned DOM: “P takes an object (ACC/DAT) case if A 
is lower than P on the Animacy/Person Hierarchy” (Malchukov 2008: 213 relying on Wali 
& Koul 1997: 155). Thus, as Malchukov (2008) points out, the global vs. local distinction 
may be observed even with DAM systems that have the same origin. Not only inherent 
argument properties of more than one argument involved in a scenario can trigger DAM, 
as in the examples above, but also non-inherent discourse-related argument properties 
of the whole scenario are known to trigger DAM. The well-known examples include 
proximate vs. obviative case marking in the Algonquian languages (see, for instance, 
Dahlstrom 1986 on Plains Cree). 


2.1.6 Properties dependent on event semantics 


In some languages DAM is not directly triggered by the inherent or discourse-related 
properties of arguments or a constellation of several arguments, as discussed in §2.1.1-§2.1.5,
but rather by the way these arguments are involved in an event. The relevant 
aspects include, among others, volitionality/control or agentivity and affectedness 
(for discussions, see Næss 2004; McGregor 2006; Fauconnier 2012: 4). DAM is used in this 
context to differentiate between various degrees of transitivity in several ways. While 
manipulating the degrees of agentivity/control/volitionality is typically done by means 
of differential agent (or subject) marking, various degrees of affectedness (pertaining to 
P arguments) and resultativity (pertaining to the verbal domain) may be expressed via 
DOM. This division of labor is, of course, expected, because such semantic entailments 
as volitionality/agentivity or affectedness are associated with the A and the P arguments, 
respectively. In what follows, we provide an overview of these two subtypes. 

Tsova-Tush provides an example of differential S marking triggered by volitionality: 
according to Holisky (1987), when the argument is volitionally involved and/or in con- 
trol of the event the S argument appears in the ergative, as in (12a), whereas when the 
involvement of the argument lacks volition or control, it appears in the nominative case, 
as in (12b): 


(12) Tsova-Tush (Nakh-Daghestanian; Georgia; Holisky 1987: 105) 
a. (As)   vuiz-n-as.
   1sERG  fall-AOR-1sERG
   ‘I fell. (It was my own fault that I fell down.)’
b. (So)   voz-en-sO.
   1sNOM  fall-AOR-1sNOM
   ‘I fell down, by accident.’


The difference between (12a) and (12b) may also be approached in slightly different terms. 
Discussing the data from Latvian and Lithuanian, illustrated in (13), Seržant (2013)
suggests that some cases of DAM might be better explained by operating with the property 
of the control over the pre-stage of an event. This account is somewhat different from vo- 
litionality and control, because the subject referent does not have control over the very 
event of falling in (12) or getting cold in (13) below. At the same time, the more agentive 
marking implies that the subject referent had the opportunity to prevent the situation 
from coming about, but failed to exercise control at the stage before the event took place. 
Thus, in Lithuanian, both (13a) and (13b) are grammatical in isolation, but given the con- 
text provided by the sentence with the doctor, only (13a) is allowed: 


(13) Lithuanian (Baltic, Indo-European; Seržant 2013: 289)
     Gydytojas  ant  skaudančio  piršto  uždėjo  ledų,  ir   po     dešimties  minučių
     doctor     on   aching      finger  put     ice    and  after  ten        minute
     a. man    piršt-as    visai  atšal-o
        I.DAT  finger-NOM  fully  get.cold-3PST
     b. *aš    piršt-ą     visai  atšal-a-u
        I.NOM  finger-ACC  fully  get.cold-PST-1SG
     ‘The doctor put ice on [my] aching finger and after 10 minutes my finger got
     cold (lit. to me the finger got cold).’ [Elicited]


In both examples (13a) and (13b), there is no direct control over the event itself on the 
part of the experiencer (to denote full control, the respective causative form of the verb 
“to get cold” has to be used in Lithuanian). 

The other subtype of DAM conditioned by event semantics, viz. affectedness and re- 
sultativity-related DAM, has often been discussed in relation to particular areas and fami- 
lies, most prominently with respect to the total vs. partitive alternation in the Finnic and 
some neighboring Indo-European languages. Languages of the eastern Circum-Baltic 
area (Dahl & Koptjevskaja-Tamm 2001) show a remarkable degree of productivity of 
this type of DAM (Seržant 2015): 


(14) Lithuanian (Baltic, Indo-European; own knowledge) 
a. Jis  iš-gėrė           vanden-į.
   he   TELIC-drink.3PST  water-ACC.SG
   ‘He drank (up) (the/some) water.’
b. Jis  iš-gėrė           vanden-s.
   he   TELIC-drink.3PST  water-GEN.SG
   ‘He drank (*the/some) water.’


The verb ‘to drink’ subcategorizes for an accusative object in Lithuanian, as in (14a),
which is the default option in this language and may have both a definite and an indefinite
(weak/‘some’) interpretation, since this language does not have grammaticalized articles
and bare NPs are generally ambiguous regarding definiteness. However, the regular
accusative marking may be overridden by the genitive case, as in (14b), where the
exhaustive or definite reading is no longer available (Seržant 2014). The genitive option
induces the indefinite-quantity reading in (14b), which, in turn, is related to
non-specificity. Furthermore, the indefinite-quantity reading renders the verbal phrase
in (14b) atelic (non-resultative in the Finnish tradition, cf. Huumo 2010): the whole event
of ‘drinking water’ becomes an activity predicate in contrast to the accomplishment
interpretation in (14a). While this effect is found mostly with verbs taking an incremental
theme (Dowty 1991) in Lithuanian (Seržant 2014), Finnic languages allow basically any
accomplishment verb to acquire an activity interpretation by means of this type of DOM,
cf. the verb ‘to open’ in (15) taking a non-incremental theme (cf. Kiparsky 1998; Huumo
2010): 


(15) Finnish (Finnic, Finno-Ugric; Kiparsky 1998: 273) 
a. Hän  avasi         ikkunan.
   he   open.3SG.PST  window.ACC.SG
   ‘He opened the window.’

b. Hän  avasi         ikkunaa.
   he   open.3SG.PST  window.PART.SG
   (i) ‘He was opening the window.’
   (ii) ‘He opened the window (partly).’
   (iii) ‘He opened the window for a while.’
   (iv) ‘He opened the window again and again.’

Crucially, all four readings in (15b) imply a construal of an event in the past that is not 
committal as to the achievement of an inherent end point (the window being open). In turn, 
only (15a) with accusative marking of the object indicates that the inherent end point 
of the process of ‘window opening’ has been achieved. At the same time, in contrast 
to (14b), there is no weak quantification of the object referent — only the verbal action 
is quantified while the object is affected holistically. Note that there is no relation to 
viewpoint (or even progressive) aspect here, as is sometimes assumed in the literature 
(see the discussion in Seržant 2015). The non-resultativity (or only partial result) of the 
event in (15b), of course, entails that the object referent has not been affected to the 
extent that it has been in (15a). 


2.1.7 Argument-triggered DAM: a summary 


§2.1 considers only those cases of DAM where argument properties function as the trigger, 
while the form of the predicate remains the same. This type has been in the focus of the 
study of DAM since its very beginning and arguably represents the consensus examples 
of DAM (cf. Bossong 1985; 1991). We follow this tradition and consider this type of DAM 
as a more central one. The following is thus our narrow definition of DAM: 


The Finnish accusative case is highly syncretic: it is homonymous with the genitive in the singular and with 
the nominative in the plural and has dedicated morphology only with personal pronouns (Karlsson 1999: 
100-101). This is why it is sometimes (somewhat misleadingly) referred to as the genitive in the traditional 
linguistic literature on Finnish. 
(16) Narrow definition of DAM: 
Any kind of situation where an argument of a predicate bearing the same 
generalized semantic role may be coded in different ways, depending on factors 
other than the argument role itself and/or the clausal properties of the predicate 
such as polarity, TAM, embeddedness, etc. 


2.2 Predicate-triggered DAM 


We now turn to the discussion of the other major type of DAM, namely, PREDICATE- 
TRIGGERED DAM. The cases of DAM to be discussed in this section involve a broader 
understanding of the phenomenon according to the definition in (1) but not according 
to the definition in (16), which requires one and the same form of the predicate. In this 
type of DAM, different (though paradigmatically related) forms of the predicate require
differential marking of their arguments and neither inherent nor discourse-related 
properties of arguments play any role. Nevertheless, we think that such DAM systems 
are of no lesser interest than the systems discussed in §2.1 and may be related to them 
diachronically. 


2.2.1 Clause-type-based differential marking 


A very common, but not very frequently discussed kind of DAM is the one in which a 
particular kind of argument marking is found in one type of clause, whereas in some 
other type of clause the relevant argument is marked differently (cf. “main” versus
“subordinate” clause split in Dixon 1994: 101 or “split according to construction” in McGregor 
2009: 492). This type of DAM can be illustrated by the comparison of the main clause 
with different types of dependent clauses in Maithili. In the main clause, the sole ar- 
gument of one-argument clauses and the more agent-like arguments of two-argument 
clauses are in the nominative, as in (17a) and (17b) respectively: 


(17) Maithili (Indo-European; India, Nepal; Bickel & Yādava 2000: 346, 347)

a. O          has-l-aith.
   3hREM.NOM  laugh-PST-3hNOM
   ‘He (h.REM) laughed.’
b. O      okra        cah-ait         ch-aith.
   3hREM  3nhREM.DAT  like-IPFV.PTCP  AUX-3hNOM
   ‘S/he (h.REM) likes him/her (nh.REM).’

However, in various types of dependent clauses, for instance in converbial clauses, as 
in (18a), and infinitival clauses as in (18b), these arguments are in the dative case: 
(18) Maithili (Indo-European; India, Nepal; Bickel & Yādava 2000: 353, 358)

a. [Hamra  (*ham)  ghar  aib-ke]   pita-ji      khusi  he-t-ah.
   1DAT    1NOM    home  come-CVB  father-hNOM  happy  be(come)-FUT-3hNOM
   ‘When I come home, father will be happy.’

b. [Rām-ke  (*Rām)   sut-b-ak           lel]  ham   yahi  tham-sá
   Ram-DAT  Ram.NOM  sleep-INF:OBL-GEN  for   1NOM  here  place-ABL
   uthi-ge-l-aúh.
   rise-TEL-PST-1NOM
   ‘I got up from this place in order for Ram to (be able to) sleep.’


Note that differential marking is never possible with one and the same form of the predi- 
cate. Instead, the two types of marking are in complementary distribution as determined 
by the matrix vs. embedded status of the predicate. 


2.2.2 TAM-based differential marking 


Tense, aspect, and mood of the clause present an often discussed trigger of DAM, in 
particular in case of differential agent marking, when discussing so-called split ergativity 
(cf. Comrie 1978; Dixon 1994: 97-101; de Hoop & Malchukov 2007). The distribution of 
case markers in Georgian illustrates this type of DAM. In the present, the agent argument 
appears in the nominative case, e.g. deda ‘mother.NOM’ in (19a). In the aorist, the agent 
argument appears in the narrative case (sometimes also called ergative), e.g. deda-m 
‘mother-NARR’ in (19b): 


(19) Georgian (Kartvelian; Georgia; Harris 1981: 42) 


a. Deda        bans                tavis     švil-s.
   mother.NOM  she.bathes.him.PRS  self.GEN  child-DAT
   ‘The mother is bathing her child.’

b. Deda-m       dabana              tavis-i       švil-i.
   mother-NARR  she.bathed.him.AOR  self.GEN-NOM  child-NOM
   ‘The mother bathed her child.’


A number of functional explanations and predictions about possible systems of mark- 
ing have been proposed with respect to the effects of tense and aspect properties of 
the clause (see Dixon 1994: 97-101; DeLancey 1981; 1982). For instance, Dixon (1994: 99) 
predicts that if a language shows differential agent marking conditioned by tense or 
aspect, the ergative marking pattern is always found either in the past tense or in the 
perfective aspect. Such functional explanations of alleged correlations of marking and 
TAM are sometimes presented as textbook knowledge (cf. Song 2001: 174). However, they 
are not unproblematic, as discussed in Creissels (2008) and Witzlack-Makarevich (2011: 
143-144). One of the problems lies in the following: The languages frequently used to 
illustrate effects of the tense-aspect properties of the clause on DAM include a number 
of Indo-Aryan and Iranian languages (e.g. Dixon 1994: 100; de Hoop & Malchukov 2007). 
However, although tense-aspect values of the clause might superficially seem to condi- 
tion a particular argument marking in these languages, the distribution of case markers 
is actually determined by certain morphological verb forms (for instance, a special par- 
ticiple or a converb) - and not by TAM as such - and this distribution has an etymological 
motivation (for examples, see Witzlack-Makarevich 2011: 144). 


2.2.3 Polarity-based differential marking 


Polarity of the clause is another predicate-related feature that has long been known to 
interact with argument marking (cf. Dixon 1994: 101). Its effects can be illustrated with 
the Finnish examples in (20). Whereas in affirmative clauses the P argument can appear 
either in the accusative or partitive case, as in (20a), in negative clauses only the partitive 
case marking of the P argument is grammatical, as in (20b): 


(20) Finnish (Uralic; Finland; Sulkala & Karjalainen 1992: 115) 
a. Söin         omena-n    /  omena-a.
   eat.1S.IPFV  apple-ACC  /  apple-PART
   ‘I ate/was eating an apple.’
b. En      syönyt     omena-a.
   NEG-1S  eat-2PTCP  apple-PART
   ‘I didn't eat/was not eating an apple.’


2.2.4 Differential marking and marking of information structure with verbal morphology 


While information-structure-driven DAM systems mostly represent cases of DAM in 
the narrow sense, as defined in (16), individual information-structural configurations 
may also require different forms of the predicate, e.g. in Somali (Saeed 1987). Similarly, 
in Arbore, the form of the predicate in (21a) is different from the one in (21b): the topical, 
nominative subject (21a) takes the predicate with the auxiliary ʔíy while the focal subject 
(21b) does not allow the auxiliary: 


(21) Arbore (Cushitic, Ethiopia; Hayward 1984: 113) 


a. Farawé       ʔí-y     zahate
   horse.F.NOM  PVS-3SG  die.3SG.F
   ‘(A) horse died.’

b. Farawa        zéhe
   horse.F.PRED  died.3SG.M
   ‘(A) HORSE died.’ (Capitals signify narrow focus)
2.3 Summary of DAM triggers 


Sections §2.1-§2.2 cover the entire range of DAM triggers. We identify two major types 
of DAM systems. On the one hand, we distinguish argument-triggered DAM systems 
with no direct dependency on the predicate form. Such systems can be triggered by 
various argument properties and event semantics and are in accordance with both our 
narrow definition in (16) and broad definition in (1). On the other hand, there is a whole 
range of DAM systems where the same argument role is marked differently in different 
subparadigms of the predicate. Table 2 summarizes this typology and provides references 
to the respective examples. 


Table 2: DAM systems according to the trigger 


DAM trigger type                                         DAM trigger                        Examples

same predicate    properties of the     inherent         animacy, person, discreteness,     Jingulu (5), (6)
form              argument              properties       part of speech, inflection class
                  (local DAM)
                                        non-inherent     definiteness, specificity,         Warrwa (9)
                                        properties       topicality, focality

                  properties of the     inherent         animacy, person, discreteness,     Aguaruna (10), (11)
                  whole scenario        properties       part of speech, inflection class
                  (global DAM)
                                        non-inherent     definiteness, specificity,         3PROX > 3OBV
                                        properties       topicality, focality               (not in the text)

                  event semantics                        affectedness, control over         Tsova-Tush (12)
                                                         the event

different                                                TAM, polarity, clause type, etc.   Maithili (17), (18);
predicate forms                                                                             Georgian (19);
                                                                                            Finnish (20)
2.4 The scope of DAM: restricted and unrestricted DAM systems 


Whereas in some languages DAM seems to apply throughout the whole language system, 
in many languages its range is restricted in various ways, e.g. to particular predicates 
or individual clause types or to particular inflectional classes. Thus, one can distinguish 
between restricted DAM systems (to be illustrated in this section) and apparently unrestricted
systems (the examples given in §2.1-§2.2, though admittedly we are not always 
certain whether DAM indeed applies without any restrictions in these languages). 

In Latvian, the nominative-accusative split in patient marking is restricted to a very 
limited domain, namely, to the debitive construction denoting necessity. The construction
is marked by an auxiliary (optional in the present tense) and the prefix jā- on the 
verb, as in (22): 


(22) Latvian (Baltic, Indo-European; personal knowledge) 
a. Tev      (ir)         jā-ciena     mani/*es.
   you.DAT  (AUX.PRS.3)  DEB-respect  I.ACC/*I.NOM
   ‘You have to be respectful towards me (ACC).’
b. Tev      (ir)         jā-ciena     viņš/māte/valsts.
   you.DAT  (AUX.PRS.3)  DEB-respect  he.NOM/mother.NOM/state.NOM
   ‘You have to be respectful towards him (NOM) / [your] mother (NOM) / [the]
   country (NOM).’ [Constructed example]


In this construction, the patient argument realized with speech-act-participant personal 
and reflexive pronouns is obligatorily marked with the accusative case, while other NP 
types are marked with the nominative case in the standard language. Elsewhere, Latvian 
does not show any DAM. The debitive construction in (22) is thus the only domain in 
Latvian within which one finds DAM. 

Another cross-linguistically recurrent type of domain for DAM is subordinate clauses. 
For instance, in Turkish, the domain for the differential subject marking is the nominal- 
ized subordinate clause in which the subject must either bear the nominative case — 
which is a morphological zero — or be marked overtly by the genitive case. In the for- 
mer case the subject has a generic, non-specific interpretation, as in (23b), in the latter 
case, it has a specific indefinite interpretation, as in (23a) (Comrie 1986: 95; Kornfilt 2008: 
83-84): 


(23) Turkish (Turkic; Kornfilt 2008: 83-84) 
a. [Köy-ü       bir  haydut-un   bas-tığ-ın]-ı     duy-du-m.
   village-ACC  a    robber-GEN  raid-FN-3SG-ACC  hear-PST-1SG
   ‘I heard that a (certain) robber raided the village.’ (specific)
b. [Köy-ü       haydut  bas-tığ-ın]-ı     duy-du-m.
   village-ACC  robber  raid-FN-3SG-ACC  hear-PST-1SG
   ‘I heard that robbers raided the village.’ (non-specific, generic)
Crucially, the nominative vs. genitive differential subject marking is found only in 
subordinate clauses, while main clauses in Turkish do not allow this type of DAM. 
Note that the distinction between subordinate and main clauses is not the trigger 
for the DAM here, in contrast to the cases discussed in §2.2.1. In this case, the DAM is 
triggered by the properties of the respective argument (specific vs. non-specific), as
discussed in §2.1.3. The only difference from the other similar examples is that the distribution 
of DAM is restricted to subordinate clauses. 

In addition to syntactically restricted domains, as in (22) and (23), DAM systems may 
also be restricted lexically. Thus, the range of DAM may be limited by a particular class 
of verbs, motivated semantically or otherwise. For instance, a small number of
one-argument predicates in Hindi/Urdu allow for differential marking of their sole argument 
conditioned by volitionality, e.g. bhók- ‘bark’, khás- ‘cough’, chik- ‘sneeze’, hás- ‘laugh’, 
etc. (see Davison 1999 for an exhaustive list). This is illustrated in (24): whereas in (24a) 
the sole argument is in the unmarked nominative case and the event of coughing is 
understood as being unintentional, in (24b) the sole argument is in the ergative case to 
reflect the intentional nature of the coughing event: 


(24) Hindi-Urdu (Indo-Aryan; India, Pakistan; Tuite et al. 1985: 264) 


a. Ram      khás-a.
   Ram.NOM  cough-PRF.M
   ‘Ram coughed.’

b. Ram-ne   khás-a.
   Ram-ERG  cough-PRF.M
   ‘Ram coughed (purposefully).’

We discussed similar cases in §2.1.6 under properties dependent on event semantics. 
The major difference between these examples and the examples in §2.1.6 lies in the fact 
that the intentionality-based DAM in Hindi/Urdu does not apply to every sole argument; 
rather, its domain is limited to a very small set of verbs. 

To summarize, the range of DAM can be restricted in various ways by the properties of 
the predicate: by various verbal grammatical categories (such as tense, aspect or mood), 
by the syntactic position (e.g. embedded vs. matrix) or by lexical restrictions (particular 
verb classes only). The categories which restrict the range of DAM are often similar to 
those discussed in §2.2, but their effect on DAM is different: whereas in restricted sys- 
tems discussed in this section we find DAM triggered mostly by the familiar inherent or 
discourse-based properties of arguments but limited to particular contexts, e.g. to par- 
ticular types of clauses, the predicate-based DAM systems in §2.2 are directly triggered 
by a particular form of the predicate. Note that the restricted argument-triggered DAM 
systems still adhere to the narrow definition of DAM in (16) alongside the unrestricted 
argument-triggered ones. Another way to put it is as follows: if one knows that the DAM 
system is restricted, one can identify the domain where one finds alternating argument 
marking. However, to predict what kind of marking an argument takes, one still has to 
consider the triggers of DAM. The cross-tabulation of the scope variable of DAM systems 
and the familiar trigger variable yields the four subtypes of DAM systems summarized 
in Table 3: 


Table 3: Typological variation of DAM systems 


                      Trigger: argument properties           Trigger: predicate properties

Scope: unrestricted   unrestricted argument-triggered DAM    unrestricted predicate-/clause-triggered DAM

Scope: restricted     restricted argument-triggered DAM      restricted predicate-/clause-triggered DAM


3 Morphological and syntactic properties of DAM 


In this section we provide a survey of the variation in DAM related to its morphological 
and syntactic properties. We first discuss the morphological dichotomy between symmetric
and asymmetric DAM systems (§3.1) and then proceed to the locus of marking 
and give a short overview of the research on differential flagging in contrast to differential
indexing (§3.2). In §3.3 we briefly consider the syntactic properties of DAM. Finally, 
§3.4 touches upon the issues of obligatoriness of DAM. 


3.1 Symmetric vs. asymmetric DAM 


From the beginning of the research on DOM it has generally been assumed that DOM 
yields a binary opposition based on markedness: certain NP types are marked in terms 
of both prominence (animacy, definiteness, etc.) and morphological encoding while oth- 
ers are unmarked, i.e., are non-prominent and morphologically unmarked (inter alia, 
Bossong 1985; 1991; but also Aissen 2003). In other words, semantic markedness is mirrored
by the morphological markedness or ASYMMETRIC encoding: X vs. zero. Many DOM 
systems are of this type, e.g. the DOM of Spanish or Persian. For example, Spanish con- 
trasts animate specific objects to all others by marking the former but not the latter with 
the preposition a. 

Recently, however, SYMMETRIC DAM systems, i.e. systems where both alternatives
receive overt morphological marking, have also become the focus of attention in several
studies (e.g. de Hoop & Malchukov 2008; Iemmolo 2013b). Some researchers have argued
that symmetric and asymmetric DAM systems are regulated by different principles 
(Dalrymple & Nikolaeva 2011: 19; Abraham & Leiss 2012; Iemmolo 2013b). For instance, 
Iemmolo's (2013b) study shows that symmetric DOM systems respond to parameters re- 
lated to the overall semantics of the event, e.g. polarity and quantification, affectedness 
or boundedness (aspectuality), whereas asymmetric systems reflect various participant 
properties, most prominently its information-structure role, animacy, referentiality, etc. 
(similarly Abraham & Leiss 2012: 320). 

While functional correlations between prominence and morphological realization of 
DOM like those put forward by Iemmolo (2013b) do indeed find some cross-linguistic sup- 
port, there are a number of counterexamples. For instance, the DOM found in Kolyma 
Yukaghir (Yukaghir, isolate) is symmetric: it requires accusative marking -gele/-kele for 
definite nouns and the instrumental case ending -le for indefinite nouns with third person
A arguments (Maslova 2003: 93). Functionally, this type of DOM is very much reminiscent
of the asymmetric DOM in Biblical (and Modern) Hebrew. The latter is also conditioned
by definiteness but, in contrast to Kolyma Yukaghir, is morphologically asymmetric,
as it requires the preposition ʾet with definite NPs and disallows it with indefinite
NPs. Counterexamples are found with differential agent marking as well: for instance, 
Warrwa (Kimberley, Western Australia; McGregor 2006) has alternations between two 
different ergatives and is thus an instance of symmetric DAM by definition. However, 
in contrast to the claims e.g. in Iemmolo (2013b), this system is solely conditioned by the 
properties of the A argument itself (such as expectedness) and is not related to verbal 
semantics. 

The aforementioned claim about the correlations of symmetrically realized DAMs 
with event interpretation, on the one hand, and asymmetrically realized DAMs corre- 
lating with participant interpretation, on the other, is too strong also for the following 
reason. The opposition between an overt vs. zero marker is only possible if there is no 
general ban on zeros in the particular domain of a language. For example, the opposition 
between accusative and nominative object marking in the Latvian debitive construction 
is functionally dependent - somewhat similarly to the Spanish DOM - on factors such 
as animacy and accessibility, but the morphological opposition here holds between
one overt marker (nominative, e.g. -s) and another overt marker (accusative, e.g. -u),
simply because Latvian disallows zero markers for any case. For this Latvian system it
is difficult to determine which option is morphologically (more) marked and which one 
is unmarked or less marked⁴ and, crucially, whether the more prominent participants (animates
and more accessible referents) or the less prominent participants (inanimates and
less accessible referents) are more coded.

While Latvian disallows zeros in all its declensional paradigms, other languages pre- 
clude zeros only in a particular (sub)paradigm: typically, the plural and pronominal 
paradigms in fusional declensions do not contain zeros. For example, in Russian, all DOM 
types are symmetric in the plural (but not in the singular) because there is a dedicated 
plural marker -y/-i for the nominative. Even the textbook example of Spanish does not
fully fit the pattern X vs. zero when it comes to pronouns, cf. a mí ‘ACC 1SG.ACC’ vs. me
‘1SG.ACC’. Pronouns are often morphologically (suppletive) portmanteau words combining
both the referential and case-marking morphemes. It is therefore often difficult to
distinguish between symmetric vs. asymmetric DAM in these cases.


⁴But see, for instance, Keine & Müller (2008) for using not only the length of markers, but also their
phonological properties, such as sonority, to determine phonological markedness.

Rarely, there are DOM systems which are asymmetric but whose asymmetry is the
reverse of what is expected: it is the morphologically marked member that is less
prominent while the zero-marked one is more prominent. The Russian DOM based
on the opposition between the accusative and the partitive use of the genitive is a case in point.
Here, the less prominent NP is always marked by the partitive genitive with dedicated
morphological coding. In turn, the accusative case has no dedicated marking for a large
number of inanimate (and some animate) NPs:


(25) Russian (Slavic, Indo-European; personal knowledge)
     a. Ja     vypil         konjak-∅.
        I.NOM  drink.SG.PST  cognac-SG.ACC
        ‘I drank up the cognac.’
     b. Ja     vypil         konjak-a.
        I.NOM  drink.SG.PST  cognac-SG.GEN
        ‘I drank some/*the cognac.’


The DOM found in (25a)-(25b) is asymmetric by definition. However, it is the semanti- 
cally more prominent NP in (25a) that is unmarked as opposed to (25b). 

To conclude, there are three ways in which prominence correlates with morphological
markedness: (i) the prominent meaning is coded with more material than the non-prominent
one (e.g. the Spanish DOM), (ii) both the prominent and the non-prominent meanings
are similarly coded (e.g. the Latvian debitive's DOM, Seržant & Taperte 2016), and
(iii) the less prominent meaning is coded with more material than the more prominent
one (cf. (25) above). However, these types are not distributed equally cross-linguistically. Type
(iii) is rarer than type (i). According to Sinnemäki (2014: 304), in the asymmetric DOM
systems conditioned by topicality, it is the topical object that receives overt marking in 
all cases. In turn, when it comes to the symmetric type (ii), the correlations mentioned 
in Iemmolo (2013b) do not seem to represent a strong bias. 


3.2 Differential flagging vs. differential indexing 


Differential marking of arguments may be realized as head- or as dependent-marking - 
a difference that is largely constrained by the strategy the language uses to mark core ar- 
guments (i.e. indexing only, indexing and flagging or flagging only, Nichols 1986). Thus, 
among others, Dalrymple & Nikolaeva (2011) treat both as different aspects of the same 
phenomenon. At the same time, indexing and flagging are often claimed to have different 
functions not only synchronically but also at earlier historical stages (cf. Croft 1988: 167-168).
While agreement or indexing is “a topic related phenomenon” as Givón (1976: 185)
puts it (cf. also Kibrik 2011), flagging is not related to topichood or information structure
in general, but rather to semantic argument roles and various dependency relations between
a head and its dependent (cf. Iemmolo 2013a). Semantic roles and various dependency
relations constitute the most frequent function of cases (cf. Blake 1994). At the
same time, dependent marking can and does sometimes end up being employed for pragmatic
rather than semantic purposes, as with the optional ergative marking illustrated
in §2.1.3, where one of the ergative markers is associated with continuous topichood, as
in (9a), while the other occurs with a certain degree of contrast, as in (9b).

Iemmolo (forthcoming) is an important attempt to delineate the distinction between 
differential object marking (DOM) or rather differential case marking, on the one hand, 
and differential object indexing (DOI) on the other. Iemmolo claims that the main distinction
between the two is that DOI is related to topic continuity whereas DOM is employed
to encode topic discontinuity. This also naturally follows from the fact that independent 
argument expressions (such as full NPs) are more related to topic discontinuities while 
verb affixes or bound pronouns are typically employed for expected referents such as 
continuous topics. It might thus be the case that the effects found in Iemmolo (forth- 
coming) are due to the distinction between different referential expressions, namely, 
independent versus bound expressions. 


3.3 Syntactic properties of DAM 


In the previous sections we have discussed morphological properties of DAM systems. 
Yet, the syntactic or behavioral properties (to use the term from Keenan 1976) of argu- 
ments in general may be heavily constrained by the morphological marking involved - 
an issue that has been notoriously neglected in the discussion of various DAM systems, 
as emphasized by Dalrymple & Nikolaeva (2011: 17, 140-141). It is tacitly assumed - and 
perhaps correctly for many but not all instances of DAM - that concomitant to a shift in 
marking of an argument, the syntactic properties of that argument do not change. How- 
ever, there are many instances in which this is not the case and differential function leads 
not only to differential marking but also to different syntactic properties, as Dalrymple 
& Nikolaeva (2011: 140-168) extensively argue for languages such as Ostyak, Mongolian, 
Chatino and Hindi. For example, marked and unmarked objects in the DOI of Ostyak 
exhibit asymmetries in syntactic behavioral properties related to reference control in 
nominalized dependent clauses, ability to topicalize the possessor, etc. where the marked 
object is more of a direct object than the unmarked (Dalrymple & Nikolaeva 2011: 17). To 
account for the differences in the syntactic properties Dalrymple & Nikolaeva (2011: 141) 
suggest two cross-linguistic categories (within the LFG framework, drawing on Butt & 
King 1996, but see already Bossong 1991: 158): the grammatically marked, topical object 
OBJ and the non-topical, unmarked object OBJθ - a distinction that was originally introduced
for objects of ditransitive verbs but was extended to monotransitive objects in
Butt & King (1996).⁵ Consider Table 4 from Dalrymple & Nikolaeva (2011: 141):

Table 4: Marked and unmarked patient/theme objects (according to Dalrymple
& Nikolaeva 2011: 141)

                                            OBJ      OBJθ
Marking                                     Yes      No
Information-structure role                  Topic    Non-topic
Properties of core grammatical functions    Yes      No


⁵Note that Butt & King (1996) use the labels OBJ and OBJθ in exactly the reverse functions from those
adopted by Dalrymple & Nikolaeva (2011).


While OBJ represents the morphologically marked, discursively salient, topical objects,
the extreme of the opposite case of OBJθ would be incorporated objects, e.g. in some Eastern
Cushitic languages (as discussed in Sasse 1984). For example, the accusative-marked
objects but not the unmarked objects in Khalkha Mongolian (all definite NPs and some in- 
definite NPs) may be combined with the topical particle ni (whose distribution is syntacti- 
cally governed) and be fronted (Dalrymple & Nikolaeva 2011: 153-154). Another example 
of differences in the syntactic properties is the Russian partitive-accusative DOM: while 
(26a) can easily be passivized, as in (26c), there is no passive counterpart in Standard 
Russian like (26d) that would match the meaning in (26b) inducing weak quantification 
of the object referent: 


(26) Russian (Slavic, Indo-European; personal knowledge)
     a. Ja     vy-pil          sok.
        I.NOM  drink.PST.M.SG  juice.ACC.SG/NOM.SG
        ‘I drank (up) the/some juice.’ [Elicited]
     b. Ja     vypil           sok-a.
        I.NOM  drink.PST.M.SG  juice.GEN.SG
        ‘I drank some juice.’ [Elicited]
     c. Sok             byl           vypit.
        juice.NOM.SG.M  AUX.PST.SG.M  drink.PST.PASS.M.SG
        ‘The juice was drunk.’ [Elicited]
     d. *Sok-a           byl-o         vypito.
        juice-GEN.SG.M   AUX.PST.SG.N  drink.PST.PASS.N.SG
        [Intended meaning] ‘Some juice was drunk.’


Ideally, according to the definition of DAM in (1) and (16), there should be no change 
in the syntactic behavior for an alternation to qualify as DAM. In case of a former dis- 
location, there should be no resumptive pronoun and, more generally, no other factors 
that would rather suggest extra-clausal status of the marked option. 


3.4 Obligatory vs. optional DAM 


de Hoop & Malchukov (2007) distinguish between fluid DAM and split DAM. The former 
refers to constellations in which an argument in one and the same proposition may
take both marking options depending on pragmatics and context. In turn, the latter is
found when the differential marking is conditioned by inherent properties of an NP. 
Indeed, systems of DAM vary in terms of the degree of obligatoriness of a particular 
marking. Whereas in some DAM systems a particular marking applies in predictable 
and consistent fashion with certain types of NPs or in certain grammatical contexts, 
other systems seem to be more flexible (cf. McGregor's (2009) “split” case marking on
the one hand, and “optional” case marking on the other). Thus, de Swart (2006) reports 
that definiteness may but need not be marked on objects in Hindi. It is only if the speaker 
commits himself to the definite interpretation that it is marked by case. Obligatoriness 
also implies that the alternative option is equally committal. To summarize, the principles
conditioning DAM may be (i) fully obligatory (splits), (ii) fully optional (fluid), or
(iii) obligatory-optional (split-fluid; similar to Type 3/mixed type in Dalrymple & Nikolaeva's 2011
typology). Note that, in contrast to de Hoop & Malchukov (2007) and Klein &
de Swart (2011), we do not attribute particular semantic domains such as definiteness
or specificity to the fluid type since there are DAM systems in which the distinction 
between definite and everything else or specific and everything else is rigid. For example, 
the definite NPs must be marked in Modern Hebrew in terms of a fairly rigid rule, thus 
yielding a split. The three types are summarized and illustrated below. 


i Splits (obligatory complementary distribution) are common both with argument- 
triggered DAM, e.g. in the case of differential marking of nouns vs. pronouns, as in 
Jingulu in (5), and with predicate-triggered DAM, such as cases of split ergativity 
where the form of the predicate determines the marking of the argument, as in the 
Georgian examples in (19). 


ii Fluid DAM works solely according to probabilistic rules, as e.g. the DSM restricted 
to negated predicates in Russian (see e.g. Timberlake 2004: 300-311 and the refer- 
ences therein). 


iii Finally, split-fluid is a DAM system which has a combination of both splitting and 
fluid contexts, i.e. contexts that obligatorily require a particular marking (splits) 
and contexts that allow for some optionality. In most of the cases, optionality is 
subordinate to splits. For example, the DOM in Persian has a rigid rule for definite
NPs which must be marked, hence, a definite-indefinite split. In turn, the realm of 
indefinites is conditioned by various degrees of individuation (Lazard 1992: 183- 
185), not exclusively by topicality (pace Dalrymple & Nikolaeva 2011: 107-113). Fi- 
nally, Kannada (Dravidian) has an animate vs. inanimate split where animates 
must be marked while inanimates are either marked or unmarked depending on 
various additional factors (Lidz 2006). 
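
The three-way classification above can be made concrete with a small sketch. The Python fragment below is only an illustration under simplifying assumptions: a DAM system is reduced to a mapping from contexts (e.g. NP types) to the set of markings allowed in each, a context counts as obligatory if it allows exactly one marking, and the names `DamType` and `classify_dam` as well as the toy encodings of Hebrew, Kannada and Russian are hypothetical simplifications of the descriptions given in the text, not analyses in their own right.

```python
from enum import Enum


class DamType(Enum):
    SPLIT = "split"
    FLUID = "fluid"
    SPLIT_FLUID = "split-fluid"


def classify_dam(contexts: dict[str, set[str]]) -> DamType:
    """Classify a DAM system from the markings allowed in each context.

    A context with exactly one marking is treated as obligatory,
    a context with more than one marking as optional.
    """
    if not contexts or any(not m for m in contexts.values()):
        raise ValueError("every context needs at least one marking")
    if len(set().union(*contexts.values())) < 2:
        raise ValueError("no alternation: only one marking is attested")
    optional = [len(m) > 1 for m in contexts.values()]
    if all(optional):
        return DamType.FLUID        # optionality everywhere
    if not any(optional):
        return DamType.SPLIT        # complementary distribution
    return DamType.SPLIT_FLUID      # obligatory and optional contexts


# Toy encodings (assumptions made for illustration only).
hebrew_dom = {"definite": {"et"}, "indefinite": {"zero"}}
kannada_dom = {"animate": {"ACC"}, "inanimate": {"ACC", "zero"}}
russian_neg_dsm = {"negated predicate": {"NOM", "GEN"}}

for name, system in [("Hebrew", hebrew_dom),
                     ("Kannada", kannada_dom),
                     ("Russian", russian_neg_dsm)]:
    print(name, classify_dam(system).value)
```

The sketch deliberately ignores the probabilistic factors that govern the choice within optional contexts; it only records whether a choice exists at all.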


While splits are defined in terms of rigid and simple rules, optionality is highly complex 
and involves a number of often competing motivations. For example, in an argument- 
triggered DAM such as Spanish DOM, different lexical verbs may considerably alter the 
preferences for DOM (von Heusinger & Kaiser 2007). In the argument-triggered DOM of 
the Latvian debitive, the preferences for ACC vs. NOM marking of non-pronominal NPs
are also dependent on the lexical verb, but not exclusively so: other factors such as the
linear position (preverbal vs. postverbal) also play an important role. In the argument- 
driven DOM of Khalkha Mongolian, definite NPs (nouns, pronouns, proper names) are 
obligatorily marked, while weak indefinite (semantically incorporated) bare NPs are obligatorily
unmarked (yielding a split). In turn, the marking of indefinite NPs modified by the indefinite
article neg is optional and tends to be constrained by factors such as discourse persistence
(whether or not the referent will be talked about in the following discourse),
animacy, affectedness, incremental relation with the verb, specificity, etc. (Guntsetseg 
2008: 67). 

While splits typically revolve around inherent properties, this need not always be the 
case. Non-inherent properties may also - albeit more rarely - yield splits. For example, 
Modern Hebrew requires all definite objects to carry the DOM marker et (Danon 2001), 
thus splitting all NP types of Hebrew into definite and indefinite ones.⁶


3.5 Summary 


So far we have outlined various DAM systems and their properties. In §1, we gave
a broad definition of DAM (1) which we recapitulate here for convenience: the term DAM
broadly refers to any kind of situation where an argument of a predicate bearing the
same generalized semantic role may be coded in different ways, depending on factors 
other than the argument role itself, and which is not licensed by diathesis alternations 
(similarly to the way it is defined in Woolford 2008, Iemmolo & Schikowski 2014). This 
definition encompasses both argument-triggered and predicate-triggered DAM systems. 

However, it has to be acknowledged that the consensus examples are all argument- 
triggered DAM, e.g. the DOM in Spanish, for which we have provided the narrower 
definition. In turn, predicate-triggered DAM systems are quite different in many respects, 
as is summarized in Table 3. Here, DAM alternations are complementarily licensed by 
two distinct forms of the predicate (e.g. past vs. present) and/or by two distinct syntactic 
positions of the predicate (embedded vs. main) - neither situation immediately
concerns NP-internal properties, scenario or event semantics. The latter are crucial for
argument-triggered DAM. To capture these differences, we have also provided the
narrow definition of DAM in (16) above, recapitulated here for convenience:


(16) Narrow definition of DAM: 
Any kind of situation where an argument of a predicate bearing the same gener- 
alized semantic role may be coded in different ways, depending on factors other 
than the argument role itself and/or the clausal properties of the predicate such as 
polarity, TAM, embeddedness, etc. 


Having said this, different predicate forms expressing, for example, different aspectual
properties (such as perfective vs. imperfective) are indeed interrelated with such factors
as event semantics, but, crucially, only indirectly (e.g. in terms of Hopper & Thompson
1980). In diachronic terms, predicate-triggered DAM systems may develop into argument-triggered
ones, which suggests that these two types are not totally distinct. To capture
potential diachronic and synchronic relations, we have introduced the distinction between
the broad definition of DAM and the narrow definition.


⁶Klein & de Swart (2011: 5) assume that fluid vs. split is “always” correlated with function (“result”) vs.
triggers.


4 Functional explanations for DAM 


In this section we will briefly survey a few common explanations of DAM. These ex- 
planations are directly linked to the understanding of what functions morphological 
marking generally serves, in particular, to the functions of case marking. Note, however, 
that these explanations primarily concern the NP-triggered and not the predicate-triggered
DAM type, i.e. only the DAM systems that satisfy the narrow definition in (16).
The two most frequently mentioned functions of case marking here are the distinguish- 
ing (also called discriminatory or disambiguating) function and the identifying (also 
called highlighting, indexing or coding) function (cf. Dixon 1979; 1994; Mallinson & Blake 
1981; Comrie 1989; Song 2001; de Hoop & Malchukov 2008; Siewierska & Bakker 2009; 
Dalrymple & Nikolaeva 2011: 3-8). The distinguishing function of case marking serves 
the purposes of disambiguation of the argument roles in clauses with two or more ar- 
guments. Case marking fulfills the identifying function in that it codes the semantic 
relationship that the argument bears to its verb. In what follows, these two functions are 
presented in further detail and linked to particular configurations of argument marking.

In the identifying approach to the function of DAM, the presence of a marker on an 
argument is independent of the relationship between the arguments of a clause. Instead, 
a particular marker is viewed as a device to highlight more fine-grained distinctions of 
the same semantic role (volitional vs. non-volitional agents, affected vs. non-affected 
patients, controlling vs. non-controlling experiencers, etc.) or various properties of the 
argument itself (e.g. in Hopper & Thompson 1980; Dalrymple & Nikolaeva 2011). For 
instance, for Naess (2004: 1206), the relevant property triggering overt object marking is
affectedness of the argument. Affectedness is, in turn, defined by employing two other 
concepts: the concept of part-whole relations and of salience. In terms of part-whole 
relations, an entity of which only a subpart is affected is generally less affected than one 
affected as a whole. The concept of salience relies on the assumption that some types of 
effects are more easily perceptible and of greater interest to humans than others (Naess 
2004: 1202). 

Recently, Sinnemäki (2014) has claimed that neither animacy nor definiteness is a
universal factor conditioning DOM. Thus, Sinnemäki (2014: 295) argues that “there is a
crosslinguistic dispreference for object case marking to be driven by animacy”. His study
shows that only 47 (= 39%) of genealogical units in his sample had an animacy effect as
opposed to 61% of genealogical units in which animacy was not the conditioning factor.
Analogously, there were 34% (43 genealogical units) affected by definiteness (with an
areal bias for the Old World) as opposed to 66% (83 genealogical units) which were not.
Both factors are found to condition DOM in 58% (70 genealogical units) as opposed to
42% (51 genealogical units) which are conditioned by some other factors (Sinnemäki 2014:
296). However, a problem with Sinnemäki's (2014) account is that he considered not only
argument-triggered DOM systems, for which the predictions mentioned above were designed,
but also predicate-triggered DOM systems such as those conditioned by split ergativity.
Moreover, crucially, the DOM systems in 42% of genealogical units are not conditioned
by one single factor but instead by a variety of factors, including tense/aspect,
singular vs. plural, gender, etc. (cf. the ones listed in Sinnemäki 2014: 284-285). Notably,
the strength of each of these is not even remotely similar to that of either animacy (39%) or
definiteness (34%), let alone animacy and definiteness together.
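
The percentages cited above can be checked against the counts of genealogical units given in the same passage. The short Python sketch below does only this arithmetic; the helper name `share` is hypothetical, and the total used for the animacy figure is an assumption (the text reports 47 units and 39% but no explicit total, so the total of 121 implied by the 70 vs. 51 split is used).

```python
def share(count: int, total: int) -> int:
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * count / total)


# Counts of genealogical units as reported in the text.
definiteness = (43, 83)   # affected vs. not affected by definiteness
both_factors = (70, 51)   # animacy and definiteness vs. other factors
animacy_units = 47

definiteness_total = sum(definiteness)   # 126
both_total = sum(both_factors)           # 121

print(share(43, definiteness_total), share(83, definiteness_total))  # 34 66
print(share(70, both_total), share(51, both_total))                  # 58 42

# Assumption: the animacy percentage is computed over the same 121 units.
print(share(animacy_units, both_total))                              # 39
```

Note that the two totals implied by the reported counts differ (126 vs. 121); the passage itself does not say why.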

In turn, Dalrymple & Nikolaeva (2011) claim that DOM is primarily motivated by the 
information structure. According to them, DOM is used to highlight “similarities between
subjects and topical objects” (Dalrymple & Nikolaeva 2011: 3-8) and to delineate
topical objects and (generally topical) subjects as primary arguments from other, less 
canonical arguments. Dalrymple & Nikolaeva (2011) make an important claim based on 
corpus frequencies that objects are as likely to be topics as foci or parts of foci and that 
focus, therefore, is not the most typical information-structural role of an object, as in 
previous accounts. 

The distinguishing function of case marking always operates together with the two 
more general principles responsible for coding asymmetries: economy and markedness 
(cf. de Swart 2006; 2007; de Hoop & Malchukov 2008). In particular, the principle of
economy requires arguments to be unmarked to reduce the speaker's efforts. In turn, 
the distinguishing function forces the speaker to mark at least one of the arguments to 
achieve their distinguishability from each other, although the choice of the argument to 
be marked is not arbitrary and is predicted by markedness: the most marked combination 
of the filler and the syntactic slot of the verb's arguments will have a longer morpholog- 
ical marking. Markedness here is based on the intuition of what represents the most 
natural monotransitive clause (e.g. the most frequent clause type in actual discourse) in 
terms of its arguments. Comrie (1989) summarizes this intuition as follows: 


“[...], the most natural kind of transitive construction is one where the A is high in
animacy and definiteness, and the P is lower in animacy and definiteness; and any
deviation from this pattern leads to a more marked construction” (Comrie 1989:
128)


This account thus predicts that animate and/or definite objects, which represent a 
less natural (i.e. more marked) combination of role and semantic features, should be 
marked formally, e.g. with an overt case marker (or by some other means, e.g. a passive 
or inverse construction), while inanimate and/or indefinite objects, which manifest a 
natural combination, need not be marked overtly (cf. Comrie 1989: 128; Bossong 1991: 
162-163; Malchukov 2008). 

There are several operational definitions for functional markedness. Bickel et al. (2015: 
10), for example, adopt the interpretation of markedness in the context of DSM following 
Silverstein's (1976) lead. They speak of markedness relations and operationalize them in 
terms of the alignment of argument roles: the sets that also include the S argument role
(i.e. {S, P}, {S, A} and {S, A, P}) are all less specific and thus less marked in comparison to
the sets {A} and {P}. They test the often claimed effects of various referential hierarchies,
such as the ones in Table 1. High-ranking A and low-ranking P arguments are then
expected to be associated with the more general sets, i.e. {S, A, P} or {S, A} for high-ranking
A arguments, and {S, A, P} or {S, P} for low-ranking P arguments.
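
The set-based operationalization just described can be spelled out in a few lines. The following Python sketch is a hedged illustration, not the procedure used by Bickel et al. (2015): the names `ALIGNMENT_SETS`, `is_general` and `expected_general_sets` are hypothetical, and only the two expectations stated above (for high-ranking A and low-ranking P arguments) are encoded.

```python
# The five alignment sets mentioned in the text, as sets of argument roles.
ALIGNMENT_SETS = [frozenset(s) for s in ("A", "P", "SA", "SP", "SAP")]


def is_general(role_set: frozenset) -> bool:
    """A set is less specific (less marked) if it also includes S."""
    return "S" in role_set


def expected_general_sets(role: str, high_ranking: bool):
    """Return the general sets expected for an argument, if any.

    Only the expectations stated in the text are encoded: high-ranking A
    and low-ranking P. For other combinations the function returns None,
    since no expectation is given there.
    """
    if (role, high_ranking) not in {("A", True), ("P", False)}:
        return None
    return [s for s in ALIGNMENT_SETS if is_general(s) and role in s]


for role, high in [("A", True), ("P", False)]:
    sets = expected_general_sets(role, high)
    print(role, "high" if high else "low", [sorted(s) for s in sets])
```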

A more radical view is that of Haspelmath (2006), who discards markedness altogether, replacing
it with frequency-based expectations. This approach can be straightforwardly related
to DAM because it provides a falsifiable account of the asymmetries mentioned above. For
example, animate direct objects are much less frequent and, hence, less expected to occur,
while objects are typically inanimate. Thus, Dahl & Fraurud (1996: 51) and
Dahl (2000) demonstrate that 87% of the direct objects in a corpus of written Swedish and
89% of those in spoken Swedish are inanimate, as opposed to 13% and 11% animate NPs,
respectively (analogous proportions are found in English and
Portuguese, cf. Everett 2009: 6, 12). This is claimed to be the reason why animate objects
require more marking than inanimate objects, which are expected anyway. In turn, the A
position seems to be less biased: there are 56% human A's vs. 44% non-humans in the same
corpus (Dahl & Fraurud 1996: 51).

Systems of case marking fulfilling a purely distinguishing function are infrequent syn- 
chronically (de Hoop & Narasimhan 2005; de Hoop & Malchukov 2008: 569). These are 
the systems of the kind described in §2.1.5 under scenario; apart from Aguaruna, other
known examples are Awtuw (Feldman 1986) or Hua (Haiman 1979). Contrary to what one 
would expect from the perspective of the distinguishing function, in the majority of DAM 
systems a particular argument marking applies mechanically across the board and is not 
restricted to marking arguments only in contexts of actual ambiguity (cf. Malchukov 
2008: 213). However, even in these cases, the distinguishing function does seem to be 
operative in the background because DAM rarely leads to syntactic ambiguities here. 

More generally, DAM provides a means for speakers to differentiate between various 
additional factors that are themselves secondary to the event and do not considerably al- 
ter the state of affairs. The exact semantic and/or pragmatic component that a particular 
DAM system contributes is sometimes difficult to discern precisely because differential 
marking does not significantly change the interpretation of the event. In turn, the versatility
of DAM systems is smoothed out by the simple, mostly binary opposition between
two case-marking strategies: these may either be in complementary distribution, or one
marking may be the semantic default that may be used in all contexts.


5 Conclusions 


Differential marking is a pervasive phenomenon cross-linguistically. Thus, Sinnemäki
(2014: 297) shows that, independently of genealogical and areal factors, the asymmetric
DOM (“restricted marking” in Sinnemäki 2014) is found in the overwhelming majority of
languages that employ flagging of objects: 74% of all genealogical units in his large-scale 
study on DOM involving 744 languages attest splits in the object marking where only a 
subset of objects is overtly marked.

Moreover, this phenomenon is highly versatile. We have suggested that the two main
types of DAM systems are the ARGUMENT-TRIGGERED DAM and the PREDICATE-TRIGGERED 
DAM with various subtypes: while the former is primarily sensitive to the interpretation 
of the respective participant (its semantic and pragmatic properties), the latter responds 
to the properties of the event: e.g. whether the event is seen as perfective or imperfective,
whether it takes place in the past or in the present, whether it is referential or modal, 
construed as independent (and hence coded by the main predicate) or as in some way 
dependent on another event (and hence coded by an embedded predication), etc. It is only 
the argument-triggered type that falls under the NARROW DEFINITION of DAM (in (16) 
above) and has been at the heart of research on DAM.

Orthogonally to this distinction we made the distinction between RESTRICTED vs. UN- 
RESTRICTED DAM systems. The former ones are found if the DAM system does not ap- 
ply across the board but is limited to specific contexts such as particular constructions 
or particular verbs; the latter, in turn, have no such restrictions. Crucially, most of the 
functional explanations of DAM revolve around the argument-triggered DAM systems
and are not applicable to the predicate-driven type. 

Furthermore, DAM systems may be classified into SPLIT, FLUID and SPLIT-FLUID systems,
depending on the degree of obligatoriness and complementarity of the markers.


AOR     aorist
ATT     attenuative Aktionsart
h       honorific
HIAF    high affectedness Aktionsart
INT     interrogative
INTENS  intensifier
min     minimal number
NARR    narrative case
REL     subject relativizer
REM     remote
RESTR   restrictive
SAP     speech act participant


Chapter 2


Differential object marking in Chichewa 


Laura J. Downing 


Göteborgs universitet


In most Bantu languages, an object prefix can occur on the verb. In some Bantu languages, 
this object prefix has a purely anaphoric function, while in others it has an additional agreement
function. Since Bresnan & Mchombo (1987), Chichewa (Bantu N.31, Malawi) has been considered
a textbook example of a language where the object marker is “always an incorporated
pronoun and never a non-referential marker of grammatical agreement” (Bresnan
& Mchombo 1987: 755). That is, in order for an overt nominal phrase (DP) to co-occur in
the same sentence with an object prefix, the DP must be a dislocated Topic. Conversely, a 
dislocated object DP (a Topic) must be anaphorically bound to an object prefix. In this pa- 
per I present new Chichewa data showing that in modern colloquial Chichewa there is a 
human/non-human asymmetry in object marking. Human object DPs commonly co-occur 
with an object prefix, whether the object is a dislocated Topic or not, whereas non-human 
ones commonly do not co-occur with an object prefix, even when they are dislocated Top- 
ics. I conclude that Chichewa shows differential object marking (or object indexation), as hu- 
manness is a more important condition on the occurrence of object prefixes than word order. 
The implications of the Chichewa (and other Bantu) data for recent proposals like Creissels 
(2006), Dalrymple & Nikolaeva (2011) and Iemmolo (2013; 2014) about the diachronic devel- 
opment of DOM agreement systems from anaphoric Topic marking systems are discussed, 
and an alternative constraints-based account is proposed. 


1 Introduction 


Object markers, commonly found in Bantu languages, are part of a complex string of
pre-stem verbal inflectional prefixes, which include an obligatory subject prefix and
tense-aspect-mood (TAM) prefixes. Object markers, when they occur, are positioned
immediately before the verb stem, as illustrated in the Swahili example below.¹ (The object
marker is bolded):


¹There are 500+ Bantu languages spoken over a huge geographic area, so, not surprisingly, this
generalization about the position of object markers does not hold for all Bantu languages. Rather, it holds for the
languages spoken in the eastern and southern parts of the Bantu region. This paper concentrates on
languages from this area. See Marten & Kula (2012) and Beaudoin-Lietz et al. (2004) for more discussion of
the variation in the position of object markers.


Laura J. Downing. Differential object marking in Chichewa. In Ilja A. Seržant &
Alena Witzlack-Makarevich (eds.), Diachrony of differential argument marking, 35-
(1) a. Structure of the Bantu verb (Meeussen 1967; Nurse 2003)
       Subject - TenseAspectMood - (Object) - [stem Root (-Extensions)-Final Vowel

    b. Swahili (Bantu; Riedel 2009: 4)
       A-li-wa-[stem on-a.
       CL1SBJ-PST-CL2OBJ-see-FV
       ‘S/he (class 1) saw them (class 2).’


The form of both subject and object markers is determined by the concord class of the
noun they refer to. Each noun concord class is traditionally assigned a number. In the
interlinear glosses in (1b), for example, CL1SBJ labels a subject marker from class 1; CL2OBJ
labels an object marker from class 2.
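
The pairing of morphemes and gloss labels in (1b) can be made explicit with a short script. This is only an illustrative sketch and not part of the analysis: the function names `align` and `object_markers` are invented here, the stem bracket of (1b) is left out of the input string, and the sketch assumes that every object marker carries a gloss label ending in OBJ, as in the glossing used in this paper.

```python
def align(form: str, gloss: str) -> list[tuple[str, str]]:
    """Pair each hyphen-separated morpheme with its gloss label."""
    morphs = form.rstrip(".").split("-")
    labels = gloss.split("-")
    if len(morphs) != len(labels):
        raise ValueError("form and gloss have different numbers of morphemes")
    return list(zip(morphs, labels))


def object_markers(pairs: list[tuple[str, str]]) -> list[str]:
    """Return the morphemes whose gloss label identifies an object marker."""
    return [morph for morph, label in pairs if label.endswith("OBJ")]


# Example (1b); the stem bracket is omitted (an assumption of this sketch).
pairs = align("A-li-wa-on-a.", "CL1SBJ-PST-CL2OBJ-see-FV")
markers = object_markers(pairs)
```

Under these assumptions, `markers` holds the single class 2 object marker of (1b), which sits immediately before the verb stem.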

As we can see in (1b), object markers can function like incorporated pronouns,
performing the function of independent pronominal words in languages like English. Work
like Givón (1976), Bresnan & Mchombo (1987), and Creissels (2006) indeed agrees that 
Bantu object markers have most plausibly developed historically from the grammatical- 
ization of independent pronouns. Creissels (2006: 44-45) proposes that there are three 
stages in the further evolution of the function of object markers cross-linguistically: 


(2) Stage I: the object marker has a purely anaphoric function, as it cannot occur
    within the limits of the clause [TP/IP] containing an overt co-referential
    object DP.

    Stage II: the object marker acquires an additional agreement function, as it
    obligatorily occurs, even if the clause contains a co-referential object DP.
    It retains an anaphoric function as it can also represent, on its own, a
    co-referential DP that is not contained within the limits of the clause.

    Stage III: at this stage, the pronominal marker has a purely agreement function,
    as it cannot represent on its own a co-referential DP not contained
    within the limits of the clause.
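
The three stages in (2) can be read as a small decision table over two properties of the object marker. The following is a minimal sketch of that reading, under simplifying assumptions that are mine and not Creissels's: both properties are reduced to yes/no values (so the obligatoriness required for Stage II is not modelled), and the function and parameter names are hypothetical.

```python
def creissels_stage(cooccurs_in_clause: bool, stands_alone: bool) -> str:
    """Classify an object marker following the stages in (2).

    cooccurs_in_clause: the marker can occur in the same clause as an
        overt co-referential object DP (agreement-like use).
    stands_alone: the marker can represent, on its own, a co-referential
        DP outside the clause (anaphoric use).
    """
    if stands_alone and not cooccurs_in_clause:
        return "Stage I"    # purely anaphoric
    if stands_alone and cooccurs_in_clause:
        return "Stage II"   # anaphoric plus agreement
    if cooccurs_in_clause:
        return "Stage III"  # purely agreement
    return "unclassified"   # combination not covered by (2)


stage_one = creissels_stage(cooccurs_in_clause=False, stands_alone=True)
stage_two = creissels_stage(cooccurs_in_clause=True, stands_alone=True)
```

On this simplified reading, a marker that shows both uses is assigned to Stage II, whether or not co-occurrence is obligatory.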


Since Bresnan & Mchombo (1987), Chichewa (Bantu N31 Malawi) has been considered 
a textbook example of a Stage I language. The object marker is "always an incorporated 
pronoun [anaphor] and never a non-referential marker of grammatical agreement" (Bres- 
nan & Mchombo 1987: 755). In order for an overt DP to co-occur in the same sentence 
with an object marker, the DP must be a dislocated Topic in their analysis. Conversely, a 
dislocated object DP (a Topic) must be anaphorically bound to an object marker (Bresnan 
& Mchombo 1987: 749). 

In this paper I present new Chichewa data showing that, in fact, modern colloquial 
Chichewa is a Stage II language, with a human/non-human asymmetry in object mark- 
ing: human object DPs commonly co-occur with object marking, whereas non-human 
ones commonly do not. I conclude that Chichewa shows differential object marking (or 
object indexation), as humanness is a more important condition on the occurrence of 
object markers than word order. 


The paper is organized as follows. First, in §2, I review Bresnan & Mchombo's (1987)
diagnostics for purely anaphoric status of object markers. In §3, I show that Chichewa
fails all of these diagnostics. Finally, in §4, I discuss the implications of the Chichewa
(and other Bantu) data for recent proposals like Creissels (2006), Dalrymple & Nikolaeva 
(2011) and Iemmolo (2013; 2014) about the diachronic development of DOM agreement 
systems and develop a constraints-based account. 


2 Diagnostics for the anaphoric vs. grammatical 
agreement function of object markers 


2.1 Object marker is purely anaphoric


Bresnan & Mchombo (1987) propose the following diagnostics that determine whether 
object markers are purely anaphors, referring to Topics and other DPs (nominal phrases) 
outside the clause in a particular language. (This corresponds to Creissels's 2006 Stage I): 


(3) Diagnostics for anaphoric use of object markers: 

a. Word order: the occurrence of the object marker correlates with non-canoni- 
cal word order; more precisely, only dislocated DPs are resumed with object 
markers and dislocated DPs must be resumed with object markers. 

b. Focused elements: cannot be referred to with an object marker. 

c. Prosody: an object DP resumed by an object marker is considered anaphoric
if the object is phrased separately from a preceding object-marked verb. 


If the object marker meets these tests, then the object marker is anaphoric. Any overt 
object DP which co-occurs with an object marker must be dislocated. Any dislocated ob- 
ject DP must be licensed with (anaphorically bound to) an object marker. Object markers 
have been argued to have a primarily anaphoric function, using these sorts of criteria, 
in Bantu languages like: Haya (Duranti & Byarushengo 1977), Northern Sotho (Zerbian 
2006), Tswana (Creissels 2006), Zulu (Buell 2005; Cheng & Downing 2009; Schadeberg 
1995; van der Spuy 1993; Zeller 2012) and Swati (Marten & Kula 2012). Indeed, Creissels
(2006) claims that Stage I object markers are very common in African languages
generally.²

The diagnostics for purely anaphoric use of the object marker are illustrated with data 
from Zulu (Cheng & Downing 2009). Canonical word order in Zulu is: S V IO DO Oblique.
As shown by the Zulu data in (4) and (5), both left and right dislocations of object DPs 
are easily elicited by asking content questions on a verb complement. Both the content 
question word or particle and the answer to the content question (which have inherent 
focus) must occur immediately after the verb. A non-focused verb complement must
be displaced from its canonical postverbal position either to preverbal position or to
a position following the element in immediately-after-the-verb (IAV) position. Note that we


²See also Riedel's (2009), Marten & Kula's (2012) and van der Wal's (2015) recent cross-Bantu surveys.
find an obligatory object marker referring to an indirect or direct object which has been
displaced from its canonical position.³


(4) Zulu left dislocations (Bantu; author's elicitation notes)
    Wh-questions

    Q  (Ámá-bhayisékíl'  u-wá-níkée  ó-baani)?
       CL6-bicycle  2SGSBJ-CL6OBJ-give.PRF  CL2-who
       ‘Whom did you give bicycles to?’

    A  (Ámá-bhayisékiili)  (si-wá-níkée  ábá-ntwaana).
       CL6-bicycle  1PLSBJ-CL6OBJ-give.PRF  CL2-child
       ‘We gave bicycles to the children.’


(5) Zulu right dislocations (Bantu; author's elicitation notes)
    Wh-questions

    Q  ((Ízí-vakáashi)  (zí-yí-thengelée-ni)  imi-ndeni  yáazo)?
       CL8-visitors  CL8SBJ-CL4OBJ-buy.for.PRF-what  CL4-families  CL4.their
       ‘What did the visitors buy for their families?’

    A  ((Ízi-vakáshí  zí-yí-thengelé  izi-nguubo)  imi-ndeni  yáazo).
       CL8-visitors  CL8SBJ-CL4OBJ-buy.for.PRF  CL10-clothes  CL4-families  CL4.their
       ‘The visitors bought clothing for their families.’


Evidence that the objects resumed with an object marker (underlined) in (5) are dislo- 
cated is that, first, they are set off prosodically from the rest of the sentence. As Cheng & 
Downing (2009) show, the main evidence for the prosodic phrasing (indicated by paren- 
theses) is lengthening of the phrase penult vowel. (Vowel length is not contrastive in 
Zulu.) Furthermore, IO DO word order is strictly respected in broad focus sentences. 
The DO IO order in (5) is only felicitous if the DO is in focus and IO is out of focus. 
As Cheng & Downing (2009) and Cheng & Downing (2012) argue, non-focused material 
cannot occur within the vP in Durban Zulu. While dislocated objects must be resumed 
with an object marker, objects in focus (and therefore in IAV position) cannot be resumed 
with an object marker. This is shown by the infelicitous sentence in (6a), where the object 
marker zi- refers to “visitors”, the word in focus, rather than to “chicken”, old information 
repeated from the question (and dislocated out of the vP): 


³The accent marks on vowels in the data indicate high tone; long vowels are indicated by doubling the
vowel. In the morpheme glosses, numbers indicate noun concord class, following the standard Bantu sys- 
tem adopted in work like Mchombo (2004). Dislocated elements are underlined, and object markers are 
bolded. 
(6) Zulu (Bantu; author's elicitation notes)

    a. Q  ((U-siipho)  (ú-yí-phékéla  BAANI)  ín-kuukhu)?
          CL1-Sipho  CL1SBJ-CL9OBJ-cook.for  CL1.who  CL9-chicken
          ‘Who is Sipho cooking the chicken for?’

    b. A  ((Ú-síph'  ú-yi-phékél  IZI-VAKAASH’)  ín-kuukhu).
          CL1-Sipho  CL1SBJ-CL9OBJ-cook.for  CL8-visitor  CL9-chicken
          ‘Sipho is cooking the chicken for the visitors.’

    a. #U-siph’  á-zí-phékél  ÍZI-VAKÁASH'  in-kuukhu.
          CL1-Sipho  CL1SBJ-CL8OBJ-cook.for  CL8-visitor  CL9-chicken


(Object marking would only be acceptable with the word order in (6a) as the answer
to a question like, “What did Sipho do with the chicken for the visitors?”, where ‘visitors’
is topical, given information.) The data set in (6) demonstrates especially clearly that in
Zulu we find the correlation between object marker and topical (or dislocated, out of
focus) status of the co-referential object that Bresnan & Mchombo (1987) and Creissels
(2006) have proposed characterizes the object marker in languages where it has a purely
anaphoric function. (This corresponds to Creissels's (2006) Stage I.)⁴


2.2 Object marker is also a grammatical agreement marker 


As far as I know, in all Bantu languages, the object marker can have an anaphoric 
(pronominal) function, resuming objects that occur earlier in the discourse as well as 
(at least some) topical, dislocated objects. The object marker also has a grammatical 
agreement-like function in some Bantu languages: it can co-occur with a co-referential 
object within the same TP/IP (i.e., roughly, a clause).⁵ Languages where this has been
demonstrated include Bemba (Marten & Kula 2012), Swahili, Sambaa, Chaga (Riedel 2009: 
59), Chimwiini (Kisseberth & Abasheikh 1977) and Manyika Shona (Bax & Diercks 2012). 
For example, in Swahili, as we saw in (1b), object markers can serve an anaphoric function,
resuming objects mentioned earlier in the discourse. They also serve a grammatical
agreement function: object marking is obligatory with overt human objects, as in (7a), and
common with definite non-human objects, as in (7c):⁶


⁴Though Zeller (2012) provides some problematic examples, showing humanness plays a role in object marking
in Zulu for some speakers in some grammatical contexts, the consensus in the Zulu literature is that
object marking correlates with dislocation of the object DP. See van der Spuy (1993); Cheng & Downing
(2009); Schadeberg (1995), and Buell (2005) for discussion.

⁵See Morimoto's (2002), Riedel's (2009), Marten & Kula's (2012) and van der Wal's (2015) recent surveys
of the variation in the function and distribution of pre-verb stem object markers, illustrating a range of
possibilities from Creissels's (2006) Stage I to Stage II. (As Creissels 2006 notes, Stage III is not common in
the languages of the world.)

⁶Object marking might not be as obligatory in colloquial Swahili as traditionally described; see Seidl &
Dimitriadis (1997) for discussion.
(7) Swahili (Bantu; Riedel 2009: 42, 46)

    a. Ni-li-mw-ona  mwanawe.
       1SGSBJ-PST-CL1OBJ-see  CL1.child.POSS.3SG
       ‘I saw his child.’

    b. *Ni-li-ona mwanawe

    c. Ni-li-zi-ona  picha  hizo.
       1SGSBJ-PST-CL10OBJ-see  CL10.picture  CL10.those
       ‘I saw those pictures.’


Riedel (2009) affirms that the object marker in these examples occurs even though the 
overt object is in its base position, and no prosodic break separates the object-marked 
verb from the overt object. Bantu languages with grammatical agreement-like object 
marking show a great deal of variation as to whether the markers are obligatory or op- 
tional. The unifying generalization is that agreement-like object markers co-occur with 
human or animate objects or with definite objects. (See, e.g. Duranti 1979; Bentley 1994; 
Morimoto 2002; Riedel 2009; Marten & Kula 2012; van der Wal 2015). That is, agreement-like
marking of objects in Bantu languages is conditioned by the topicality hierarchies
in (8):⁷


(8) Topicality hierarchies (Hyman & Duranti 1982: 224)
    a. Benefactive > Recipient > Patient > Instrument
    b. 1st > 2nd > 3rd human > 3rd animal > 3rd inanimate
    c. definite > indefinite


These hierarchies have also been shown to play a central role in defining other object 
properties in Bantu languages (Duranti 1979; Hyman & Hawkinson 1974; Hyman & Du- 
ranti 1982), and in conditioning differential object marking in a number of typologically 
diverse languages. (See e.g. Comrie 1981; 1989; Aissen 2003; Iemmolo 2013; 2014). Creissels
(2006: 48-49) characterizes Bantu languages like Swahili as in transition from Stage I to
Stage II because agreement object markers are not entirely obligatory. This is because 
only some types of objects - human and definite - show agreement-like object mark- 
ing in Swahili. He notes that pure Stage II object marking systems are not common in 
African languages, but provides no explanation for why this might be so. I take up the 
discussion of how languages might change from anaphoric object marking to a system of 
differential grammatical agreement object marking in §4. First, I review the distribution
of object marking in modern colloquial Chichewa. 


⁷See Witzlack-Makarevich & Seržant (2018 [this volume]) for a detailed overview of the role of different
versions of the hierarchies in (8) in accounting for DOM. While the term topicality hierarchy is well-established
in the literature, a number of other terms are also in current use, as Witzlack-Makarevich & Seržant (2018
[this volume]) make clear.
3 The function of object markers in Chichewa: anaphoric
or grammatical agreement? 


As noted above, since Bresnan & Mchombo (1987), Chichewa is considered to be a pro- 
totypical Stage I language: the object marker is always an anaphor and signals that the 
cooccurring object does not occur within the same VP as the object marker. Furthermore, 
dislocated objects must be resumed by an object marker. Recall that these claims about 
the pronominal status of the object marker are based on the diagnostics in (3). In this 
section, I present new Chichewa data, recently elicited in Malawi.⁸ As we will see, object
marking in modern colloquial Chichewa fails all three of Bresnan & Mchombo's (1987) 
diagnostics for anaphoric status. Instead, it shows differential object marking properties. 
I take up Bresnan & Mchombo's (1987) diagnostics one by one below. 


3.1 Changes in word order and object marking 


In Chichewa, as in most Bantu languages, the basic word order is: (Subject) Verb (Ob- 
ject1) (Object2) (Oblique). (See, e.g. Heine 1976; Bearth 2003; Downing & Hyman 2016). 
Chichewa allows multiple objects, with a non-theme (e.g. benefactive) object generally 
preceding the theme object. Adverbials and other oblique arguments are found at the 
periphery of the main clause. According to Mchombo (2004), nothing can separate an 
object nominal from the preceding verb, unless the verb is object-marked. 

In my corpus one frequently finds examples where a co-referential object marker 
on the verb resumes a dislocated object DP. (Parentheses continue to indicate prosodic 
phrasing.)⁹ This data is consistent with Bresnan & Mchombo's (1987) diagnostics for the
purely anaphoric status of object marking given in (3): 


(9) Left dislocations
    Chichewa (Bantu; author's elicitation notes)

    a. ((Chi-máangá)  (á-chí-lima  nyengo  i-ku-bwélaa-yi)
       CL7-maize  CL1SBJ.PST-CL7OBJ-cultivate  CL9.season  CL9-PROG-come-CL9.REL
       ((ndipó  fóodya)  (a-dzá-mú-lima  nyengo  Ínáayo).
       and  CL3.tobacco  CL1SBJ-FUT-CL3OBJ-cultivate  CL9.season  CL9.next
       ‘Maize she cultivated this season, and tobacco she will cultivate next season.’


⁸The data was collected using an elicitation questionnaire for an investigation that had as its original aim to
describe the prosody of dislocated nominals. However, once I noticed that the use of object markers did not 
match Bresnan & Mchombo's (1987) description, I re-elicited data from Bresnan & Mchombo (1987) to test 
their diagnostics for the distribution of object markers on this set of speakers. The elicitation interviews 
were conducted in Malawi in 2011 and 2013, primarily with four native speakers of Chichewa aged between 
22 and 40 years old. The resulting corpus investigating the distribution of object markers consists of some 
50–75 sentences per speaker. Pascal Kishindo, Professor of Chichewa syntax at Chancellor College and a
native speaker of Chichewa, kindly checked the corpus and has confirmed that all the examples cited in 
this section are grammatical. 

⁹See Cheng & Downing (2016) for justification of the prosodic phrasing indicated in these examples.
    b. (Mwaná  wódwálaa-yo)  (á-kú-mu-téngéla  ku  chipataalá)  (ndi  ndaání)
       CL1.girl  CL1.sick-CL1DEM  CL1SBJ-PROG-CL1OBJ-take.to  LOC  CL7.hospital  COP  who
       ‘That sick child, (the one) taking her to the hospital is who?’


(10) Right dislocations
     Chichewa (Bantu; author's elicitation notes)

     a. (((Pa  tébuuló)  (wa-zí-ika)  mtsíkaana)  mbaale).
        LOC  CL5.table  CL1SBJ.PST-CL10OBJ-put  CL1.girl  CL10.plate
        ‘On the table, [she] put them, the girl, plates.’

     b. ((((Udzuúdzú)  (u-na-wá-lúmá  kwámbíili)  pa  nyaánjá)  dzuuló)  a-soodzi).
        CL14.mosquito  CL14SBJ-PST-CL2OBJ-bite  much  LOC  CL9.lake  yesterday  CL2-fisherman
        ‘The mosquitoes bit them a lot on the lake yesterday, fishermen.’


We find many examples, though, where the occurrence of the object marker does not 
correlate with dislocation of the co-referential DP. Human objects are often resumed by 
an object marker, even when they are in their base position, immediately following the 
verb. In (11), the same sentence is given in four variants: two word orders, each with and
without the object marker. Note that no
prosodic break separates the overt object from the verb in these examples, and there is
no other evidence that the overt object is dislocated in any of the sentences:¹⁰


(11) Chichewa (Bantu; data re-elicited from Bresnan & Mchombo 1987)
     a. (Njúuchí)  (zi-na-lúmá  a-leenje).
        CL10.bee  CL10SBJ-PST-bite  CL2-hunter
     b. (Njúuchí)  (zi-na-wá-lúma  a-leenje).
        CL10.bee  CL10SBJ-PST-CL2OBJ-bite  CL2-hunter
     c. ((Zi-na-lúmá  a-leenje)  njúuchi).
        CL10SBJ-PST-bite  CL2-hunter  CL10.bee
     d. ((Zi-na-wá-lúma  a-leenje)  njúuchi).
        CL10SBJ-PST-CL2OBJ-bite  CL2-hunter  CL10.bee
        ‘The bees bit the hunters.’


¹⁰I am not the first to observe that object markers can co-occur with in situ (human) objects in Chichewa.
Indeed, Bresnan & Mchombo (1987) mention this possibility in a footnote. Bentley (1994) and Henderson
(2006) also provide a few examples. As far as I know, though, this paper is the first attempt to systematically
document the role of humanness in conditioning object marking in Chichewa.
The point these examples illustrate is that it is acceptable for the object marker wa- to co-occur
with the object it refers to, “hunters”. Both the sentences containing wa-, (11b) and (11d),
and the ones omitting it, (11a) and (11c), are judged grammatical by all the speakers
I have asked, even though, according to Bresnan & Mchombo (1987), the versions with
the object marker should not be acceptable. More examples of the use of object markers
with in situ objects are given below. (Note that in Chichewa, unlike in Zulu, objects in
focus are not required to occur in immediately-after-the-verb position):


(12) Chichewa (Bantu; author's elicitation notes)
     (M-zákee-yó)  (a-na-mú-pátsá  Maliya  chóóváala).
     CL1-friend-CL1DEM  CL1SBJ-PST-CL1OBJ-give  CL1.Mary  CL7.dress
     ‘Her friend gave Mary a dress.’


(13) Chichewa (Bantu; Downing & Mtenje 2011: 84, 91)
     a. (Ndi  zóóváala)  (zi-méné  a-lendó  á-ná-mu-gulílá  m-phunzitsii-zo).
        COP  CL8.clothes  CL8-REL  CL2-visitor  CL2SBJ-PST-CL1OBJ-buy.for  CL1-teacher-CL8.REL
        ‘It is clothes that the visitors bought for the teacher.’

     b. ((Ti-na-kámána  nd’  áá-méné  á-ná-mu-óná  Báanda)  dzuulo).
        we-PST1-meet  with  CL2-REL  CL2SBJ-PST-CL1OBJ-see  CL1.Banda  yesterday
        ‘We met the ones who saw Banda yesterday.’


Human objects are commonly resumed with an object marker whether they precede 
or follow a content question word like chiyáani ‘what’; word order has no effect on the 
occurrence of object marking: 


(14) Chichewa (Bantu; author's elicitation notes)

     a. ((Mu-ku-wá-phíkila  chiyáani)  aáná)?
        you-PROG-CL2OBJ-cook.for  what  CL2.children
     b. ((Mu-ku-wá-phíkila  aáná)  chiyáani)?
        you-PROG-CL2OBJ-cook.for  CL2.children  what
        ‘What are you cooking for the children?’


Another problem for the anaphoric status of object markers posed by this data is that 
non-human objects are not systematically resumed with an object marker. This is true 
even in contexts where they meet diagnostics for dislocation, such as preverbal position: 
(15) Preverbal objects
     Chichewa (Bantu; author's elicitation notes)
     a. ((U-nga-kumbukila  kuti  búkuu-li)  a-ná-gulá-di  ku  Blántaayá)?
        you-can-remember  that  CL5.book-CL5.this  CL1SBJ-PST-buy-EMPH  LOC  Blantyre
        ‘Can you remember whether she bought this book in Blantyre?’

     b. (Chí-mánga  á-líma  ch-aka  ch-iinó)  (ndipó
        CL7-maize  CL1SBJ.PST-cultivate  CL7-season  CL7.this  and
        fódya  a-dzá-líma  ch-aka  chá  máawa).
        CL3.tobacco  CL1SBJ-FUT-cultivate  CL7-season  CL7.of  next
        ‘Maize, she will cultivate this season, and tobacco she will cultivate next
        season.’ (cf. (9a))

     c. (Kodi  makáala)  (u-náa-gula  kuuti)?
        Q  CL6.charcoal  you-PST-buy  where
        ‘Where did you buy charcoal?’


Non-human objects are also not necessarily resumed with an object marker when they
follow a postverbal temporal adjunct. This is another position where they are clearly
dislocated, since objects otherwise cannot be separated from the verb by an adjunct in
Chichewa (Mchombo 2004):


(16) Postverbal, post adjunct object
     Chichewa (Bantu; author's elicitation notes)

     a. Context: ‘When will s/he write a letter to the school?’
        ((A-dzá-lémba  máawá)  káláta  yópítá  ku  sukúulu).
        CL1SBJ-FUT-write  tomorrow  CL9.letter  CL9.of.INF.go  LOC  CL5.school
        ‘S/he will write a letter to the school tomorrow.’

     b. Context: Can you also play the drums?
        (inde)  (ndí-ma-yímba  BWINO  ng'ooma).
        yes  I-HAB-play  well  CL10.drum
        ‘Yes, indeed, I play the drums well.’


In some cases, a consultant would even pronounce the verb with and without the
object marker in successive repetitions of the same sentence:


(17) Chichewa (Bantu; author's elicitation notes)

     a. Context: Where did you buy the charcoal?
        ((Ta-gula  KU  MSIIKÁ)  makáala).
        we.PST-buy  LOC  CL3.market  CL6.charcoal
     b. ((Ta-wá-gula  KU  MSIIKÁ)  makáala).
        we.PST-CL6OBJ-buy  LOC  CL3.market  CL6.charcoal
        ‘We bought the charcoal at the market.’


¹¹The attentive reader will have noticed that there are a number of different past tenses, all labeled PST. I
have not labeled them more specifically, as choice of tense does not condition object marking.


Following a content question word (or other word) in immediately-after-the-verb position
(indicated with capital letters), an object marker is again not obligatory for a non-human
object:


(18) Chichewa (Bantu; author's elicitation notes)

     a. (((Kodí  azi-bambo  a-na-nyámúla  BWÁANJI)  makáala)  ku  msiika)?
        Q  CL2-man  CL2SBJ.PST-carry  how  CL6.charcoal  LOC  CL3.market
        ‘How did the men carry the charcoal to market?’

     b. (((Kodí  m-tsíkana  a-naká-chápá  KUUTI)  zóoválá  zá  á-máy'  aáké)?
        Q  CL1-girl  CL1SBJ-PST-wash  where  CL8.clothes  CL8.of  CL2-mother  CL2.her
        ‘Where did the girl wash her mother's clothes on Sunday?’


According to my language consultants, there is no difference in interpretation, whether 
the object marker is present or not. This overabundance of human object marking com- 
pared to non-human is also found in relative clauses. As studies of Chichewa relative 
clauses like Downing & Mtenje (2011), Henderson (2006), and Mchombo (2004) show, hu- 
man indirect object heads are obligatorily resumed with object marking on the relative 
verb (19a); human direct object heads are commonly resumed (19b); while non-human direct
object heads are not resumed (19c). (The facts regarding non-human indirect object
heads need further study.)


(19) Chichewa (Bantu; Downing 2010, Downing & Mtenje 2011: 76, 78. The RC is underlined.)
     Human head of RC - object marking
     a. ((A-lendó  a-méné  á-ná-wa-bweretsérá  m-pháatso)  a-koondwa).
        CL2-visitor  CL2-REL  CL2SBJ-PST-CL2OBJ-bring.for  CL10-gift  CL2SBJ-be.happy
        ‘The visitors who they brought the gifts for are happy.’

     b. ((A-lendó  a-méné  Banda  á-ná-wá-óná  ku  sukuulu)  a-piítá).
        CL2-visitor  CL2-REL  CL1.Banda  CL1SBJ-PST-CL2OBJ-see  LOC  CL5.school  CL2SBJ-go
        ‘The visitors who Banda saw at the school have gone.’

     Non-human head of RC - no object marking
     c. ((M-waná  wá  súkúlú  a-ná-lémba  káláta  i-méné
        CL1-child  CL1.of  school  CL1SBJ-PST-write  CL9.letter  CL9-REL
        m-phunzitsi  á-ná-weléenga)  kwá  a-nyúuzi).
        CL1-teacher  CL1SBJ-PST-read  for  CL2-newspaper
        ‘A student wrote the letter which the teacher read for the newspaper.’


3.2 Object markers and focus 


As Bresnan & Mchombo (1987) argue, if the object marker in Chichewa were a Stage I,
purely anaphoric marker, it should never be co-referential with an element in
focus. However, we find object marking for human words in focus: e.g. content question 
words and the answers to content questions, as shown by the data below: 


(20) Chichewa (Bantu; author's elicitation notes)

     a. (Kodi)  ((u-na-mú-óná  NDAÁNI)  ku  tchálítchi  m'máawá)?
        Q  2SGSBJ-PST-CL1OBJ-see  CL1.who  LOC  CL5.church  LOC.morning
        ‘Who did you see at church in the morning?’

     b. Q  (Kodi  ámáyi  a-ná-m-pátsá  NDANÍ  ma-lalaanje)?
           Q  CL2.mother  CL2SBJ.PST-CL1OBJ-give  CL1.who  CL6-orange
           ‘Who did mother give the oranges to?’
     c. A  ((Amáayi)  (a-ná-m-pátsá  NZÁAWO)  ma-lalaanje).
           CL2.mother  CL2SBJ.PST-CL1OBJ-give  CL1.POSS.friend  CL6-orange
           ‘Mother gave her friend the oranges.’


Note in the following example that the dislocated non-human object kalata-yo is not 
resumed with an object marker, while the in situ, focused human object Prisca is: 


(21) Chichewa (Bantu; author's elicitation notes)
     Context: ‘Who did the teacher write the letter to?’
     (Kálátaa-yó)  (a-ná-mú-lémbera  PRÍSCA).
     CL9.letter-DEM  CL1SBJ-PST-CL1OBJ-write.to  CL1.Prisca
     ‘That letter, the teacher wrote to Prisca.’


The by now familiar human vs. non-human asymmetry in object marking also holds 
in this focus context. It is considered ungrammatical to use an object marker with a 
non-human content question word: 


(22) Chichewa (Bantu; author's elicitation notes)
     a. (Kodi  mu-ku-fúná  chiyáani)?
        Q  2SGSBJ-PROG-want  CL7.what
        ‘What do you want?’
     b. *Kodi  mu-ku-chi-funa  chiyani?
        Q  2SGSBJ-PROG-CL7OBJ-want  CL7.what
As we see, it is humanness, not a topic-focus distinction, which conditions the occur-
rence of the object marker. 


3.3 Prosodic phrasing and the occurrence of the object marker 


In Bresnan & Mchombo's (1987) analysis, prosody provides additional evidence that 
an object nominal that co-occurs with a co-referential object marker is dislocated. A 
prosodic break signals the syntactic constituent edge preceding a right-dislocated DP, 
which, in their account, is always resumed by an object marker. Work like Bresnan &
Mchombo (1987) and Kanerva (1990) demonstrates that there are two kinds of systematic
evidence for prosodic phrase breaks in Chichewa: significant lengthening of the
phrase penult vowel and tonal alternations related to penult lengthening, such as final
high tone retraction and the blocking of high tone spread. Recall that the Zulu data in
(4)–(6) illustrate the expected prosodic break preceding a (right-)dislocated object DP (underlined),
which is obligatorily resumed by object marking on the verb. An example is repeated 
here for convenience; notice the phrase penult lengthening on the word preceding the 
right-dislocated object: 


(23) Prosody and right dislocation in Zulu (Bantu; author's elicitation notes)
     Q  ((Ízí-vakáashi)  (zí-yí-thengelée-ni)  imi-ndeni  yáazo)?
        CL8-visitors  CL8SBJ-CL4OBJ-buy.for.PRF-what  CL4-families  CL4.their
        ‘What did the visitors buy for their families?’

     A  ((Ízí-vakáshí  zi-yi-thengelé  izi-nguubo)  ímí-ndeni  yáazo)
        CL8-visitors  CL8SBJ-CL4OBJ-buy.for.PRF  CL10-clothes  CL4-families  CL4.their
        ‘The visitors bought clothing for their families.’


However, the attentive reader will have noticed in the Chichewa data presented in 
the preceding sections that we do not always find a prosodic break before an object 
resumed by an object marker. We also do not always find an object marker resuming 
objects that are set off by a prosodic break. In (24a), for example, there is a break, but no 
object marker. Note the penult vowel lengthening and the continuation high tone on ku
msiiká, the word before the dislocated object, confirming the prosodic phrase break in
both (24a) and (24b):


(24) Prosody and right dislocation in Chichewa (Bantu; author's elicitation notes)

     a. [Context: Where did you buy the charcoal?]
        ((Ta-gula  KU  MSIIKA)  ma-kaala).
        we.PST-buy  LOC  CL3.market  CL6-charcoal

     b. ((Ta-wa-gula  KU  MSIIKA)  ma-kaala).
        we.PST-CL6OBJ-buy  LOC  CL3.market  CL6-charcoal
        ‘We bought the charcoal at the market.’
To support these claims about the lack of correlation between prosody and object
marking, three representative pitch tracks are given below. Figures 1 and 2 illustrate
the prosody for the two sentences in (25). Note that there is no obvious prosodic break
following the verb and human object DP, in its base position, whether the verb is
object-marked (as in (25b)) or not (as in (25a)). Compare the length of the penult vowel in njuuchi,
which does precede a break, with the penult vowel in the verb in the two examples:


(25)
     a. without object marker
        (Njuuchi)  (zi-na-lúmá  a-leenje).
        CL10.bee  CL10SBJ-PST-bite  CL2-hunter

     b. with object marker
        (Njuuchi)  (zi-na-wá-lúma  a-leenje).
        CL10.bee  CL10SBJ-PST-CL2OBJ-bite  CL2-hunter
        ‘The bees bit the hunters.’


Figure 1: Example (25a), without object marker

Figure 2: Example (25b), with object marker


And as shown by the pitch track in Figure 3, in the sentence in (26), there is a break
setting off the overt object in preverbal position - it is clearly in a non-canonical position
- yet we find no co-referential object marker on the verb. Instead, the in situ, focused
object is resumed with an object marker. However, as we can see, the penult vowel of
the verb is quite short, and there is no other evidence for a prosodic break following the
verb. The postverbal object must be in its canonical, verb phrase-internal position. This
is an especially striking piece of data confirming that humanness trumps other factors
in conditioning object marking.


(26) [Context: “Who did the teacher write the letter to?'] 
(Kálátaa-yó) (a-ná-mú-lémbera PRÍISCA). 
CL9letter-DEM — CL1SBJ-PST-CL10BJ-write.to  CL1.Prisca 


“That letter, s/he wrote to Prisca’ 


To sum up this section, object marking in modern colloquial Chichewa fails all three 
of Bresnan & Mchombo's (1987) tests for purely anaphoric status. There is a striking 
tendency for object markers to co-occur with human objects, whatever their position. 
Object markers do not obligatorily occur, however, with non-human objects, whatever 
their position. Prosodic breaks do not systematically set off objects that are co-referential 
with object markers. Chichewa object marking is therefore not purely anaphoric. Rather, 
itis at Stage IL in Creissels (2006)' terms (see (2)), and, moreover, shows differential object 
marking properties. 
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| a-ná-mú-lémb e| ra Príisca 


That letter s/ he wrote to Prisca 


0 2.168 
Time (s) 


Figure 3: Example (26) 


4 Implications for diachronic development 


Although object marking in Chichewa no longer has a purely anaphoric function, the 
literature on the diachronic development of DOM systems from Givón (1976) onwards 
agrees that the agreement-like object marking shown in the modern colloquial Chichewa 
data most likely develops from the grammaticalization of anaphoric marking of topical 
objects. This section takes up two recent approaches to grammaticalization of object 
marking, Creissels (2006) and Dalrymple & Nikolaeva (2011). I show that neither straight- 
forwardly accounts for the Chichewa data, and I propose an alternative, constraints- 
based approach. 


4.1 Creissels (2006) 


Creissels's (2006) typology of the diachronic development of object marking given in (2)
recognizes two end points, anaphor and agreement, in the diachronic development
of object marking systems. These are his Stage I and Stage III, respectively. In Stage II,
the intermediary stage, object marking retains anaphoric properties and also extends
its functions to mark grammatical agreement. As the data shows, Chichewa does not fit
into any of Creissels's (2006) three stages. The reason Chichewa poses a problem for this
approach is the same one mentioned in discussing Swahili in §2.1, above. Pure Stage II
Bantu languages are not found because this stage does not take into account the role of
the topicality hierarchies (8) in conditioning the occurrence of the object marking with
a co-referential overt object. The stages are defined purely in terms of the morphosyn-
tactic distribution of the object markers. This oversight in Creissels's (2006) typology is
surprising. Work on Bantu and other languages (e.g. Duranti 1979; Bentley 1994; Morimoto
2002; Aissen 2003; Riedel 2009; Marten & Kula 2012; van der Wal 2015)
clearly establishes the role of the hierarchies in (8) in conditioning object marking.
Indeed, much of the original work on the hierarchies in (8) from the 1970s and early 1980s inves-
tigated object properties in Bantu languages (e.g. Hyman & Hawkinson 1974; Duranti
1979; Hyman & Duranti 1982). And recent surveys of Bantu object marking (Morimoto
2002; Riedel 2009; Marten & Kula 2012; van der Wal 2015) confirm that one can classify
object marking in different Bantu languages according to different cut-off points along
the topicality hierarchies. None of these authors reports a Bantu language where object
marking obligatorily indexes all indefinite non-animate entities (along with objects with
features high in the hierarchies in (8)).¹² What is missing in Creissels's (2006) grammati-
calization stages is an explicit formalization of the role of topicality features in triggering
a transition from Stage I languages, where object marking indexes topicalized objects,
to Stage II languages, where objects with features high in the hierarchies in (8) (as well
as topicalized objects) are marked.


4.2 Dalrymple & Nikolaeva (2011) 


Chichewa is equally problematic for Dalrymple & Nikolaeva's (2011: 214-216) proposed 
grammaticalization paths for DOM.¹³ In their approach, as in Creissels's (2006), the origi-
nal situation is for only topical (i.e., clause-external) objects to be resumed with an object
marker, while non-topical (clause-internal) ones are unmarked. (This is roughly equiva- 
lent to Creissels's (2006) Stage I.) DOM arises via two paths. Object marking can spread 
to nontopical objects with features that place them high on the hierarchies in (8): i.e., 
topic-worthy objects. This path, shown in (27), resembles Creissels's (2006) transition 
from Stage I to Stage II. 


(27) Spreading of DOM (Dalrymple & Nikolaeva 2011: 215)


topical   nontopical    >    topical   nontopical
marked    unmarked           marked    marked / unmarked


12 As Marten & Kula (2012), following Stucky (1981) and van der Wal (2009), observe, Makhuwa represents an
interesting case where further grammaticalization has occurred. Objects in classes 1 and 2 (which mainly
contain human nouns) are marked whether they are human/animate or not. That is, the agreement
class trumps semantic features like humanness in conditioning object marking.

13 Most of Dalrymple & Nikolaeva's (2011) thoughtful work demonstrates the role of information structure in
conditioning object marking: objects are marked in many languages if they are secondary topics. I know
of only one Bantu language where information structure has been claimed to directly condition marking
of non-dislocated objects. As Bax & Diercks (2012) demonstrate, in situ objects in Manyika Shona are
marked if they are [-Focus]. Since Chichewa and most other Bantu languages mark objects with particular
semantic features, I discuss here only Dalrymple & Nikolaeva's (2011) approach to grammaticalization, not
their general approach to object marking. See Iemmolo (2013; 2014) for a thoughtful critique of Dalrymple
& Nikolaeva (2011).

Dalrymple & Nikolaeva (2011: 215) suggest that spreading accounts for Bresnan &
Mchombo's (1987) distinction between anaphoric and agreement function of object mark-
ers in Bantu languages: spreading leads to the development of agreement-like properties.
However, their approach improves on both Bresnan & Mchombo (1987) and on Creissels
(2006) by making explicit the role of the hierarchies in (8) in motivating the marking
of only certain nontopicalized objects, leading to a DOM system. The other scenario,
schematized in (28), is for object marking to narrow. In this scenario, only a subset of
topical objects (those with features high on the topicality hierarchies in (8)) come to be
marked, while other objects, whether topical or nontopical, are unmarked:


(28) Narrowing of DOM (Dalrymple & Nikolaeva 2011: 218)


topical   nontopical    >    topical              nontopical
marked    unmarked           marked / unmarked    unmarked


Chichewa does not straightforwardly fit either of these scenarios, as object marking 
both spreads and narrows in Chichewa. Object marking spreads to nontopical objects, if 
they are human and therefore high on the topicality hierarchies in (8). Object marking 
also retracts from less topic-worthy objects, even if they are topical (i.e., in a position 
outside of the clause). A more general problem is that the second path, simple narrowing,
is not consistent with the proposal that object marking in Bantu languages arises
in stages along an anaphor-agreement continuum (Bresnan & Mchombo 1987; Creissels 
2006; Givón 1976). Creissels’s (2006) Stages II and III preclude the possibility of narrow- 
ing the object marking of anaphoric nominals without also spreading object marking 
to indicate grammatical agreement. And, indeed, I have not found any examples of sim- 
ple narrowing in the literature on Bantu object marking. The assumption is that object 
marking by default tracks topicalized (dislocated) objects, while agreement-like marking 
is the more restricted innovation. (See e.g. Riedel 2009; Bax & Diercks 2012; Marten & 
Kula 2012.) However, narrowing of object marking subsequent to spread can be seen as 
a logical progression in the development of a grammatical agreement system from an 
anaphoric one. What is missing from Dalrymple & Nikolaeva’s (2011) approach, then, is 
a way of placing their grammaticalization paths on an anaphor-agreement continuum. 


4.3 An alternative 


In this section, I propose an alternative account of the grammaticalization of object mark-
ing in Chichewa which combines aspects of both Creissels's (2006) and Dalrymple &
Nikolaeva's (2011) approaches. Following Iemmolo (2013; 2014), I propose that in Chi-
chewa (and other Bantu languages where object marking is conditioned by the topical-
ity hierarchies) the object marker is reinterpreted as marking topic-worthiness rather
than topic-hood. (Topic-worthy objects are ones with semantic features that are high
in the topicality hierarchies.) Topic-worthy objects come to be marked, whether they
are topical or nontopical in information structure or syntactic terms. Less topic-worthy
objects are not obligatorily marked, even if they are topical. That is, topic-worthiness
trumps both information structure and syntax in triggering the development of Bantu
agreement-like object marking systems from purely anaphoric agreement systems.
I formalize these observations in terms of the syntactic and semantic constraints in (29).
The syntactic ones are adapted from observations in work like Bresnan & Mchombo
(1987), Morimoto (2002) and Creissels (2006) concerning the distribution of object markers.
The semantic ones are inspired by work like Aissen (2003), Dalrymple & Nikolaeva (2011),
Iemmolo (2013; 2014) and Morimoto (2002) on the role of topic-worthiness in defining
DOM.¹⁴


(29) Constraints defining the development of DOM from a topic-marking system
syntactic¹⁶
a. *[Index_i, NP_i]_VP:
Grammatical agreement with an overt in situ object nominal violates this con-
straint, as object marking with an overt in situ object (if identical in form to
the anaphoric use of the object marker) violates the condition that there should be
only one expression of the object in the VP.

b. MAX ARGUMENT/VP:
Argument roles in the input VP must be realized overtly in the output VP (Mo-
rimoto 2002). This constraint is violated if a topicalized object is not resumed
with an object marker.

semantic
c. *øIndex[+TW]:
Topic-worthy [+TW] objects should be indexed by object marking (Aissen
2003). (Topic-worthiness is defined by the topicality hierarchies in (8).)

d. *Index[-TW]:
Non-topic-worthy [-TW] objects should not be indexed by object marking.


Ranking the constraints in Optimality Theoretic style tableaux allows one to use a 
factorial typology to formalize the steps in the development of Bantu DOM systems 
and to formalize the relative importance of each constraint in defining stages along a 
grammaticalization path. 


14 As work since Comrie (1981; 1989) proposes, marking highly topic-worthy objects plausibly has a disam- 
biguating function, since nominals high in the topicality hierarchy are canonically subjects, rather than 
objects. See Witzlack-Makarevich & Seržant (2018 [this volume]) for further discussion. 

15 See Morimoto (2002) and van der Wal (2015) for recent proposals formalizing the agreement-anaphor con-
tinuum for Bantu object marking in theoretical syntax frameworks. It is beyond the scope of this paper to 
critique these formal alternatives. 

16 As a reviewer points out, the combined syntactic constraints in (29a) and (29b) bear a resemblance to the Theta
Criterion in generative grammar (Chomsky 1981): “Each argument bears one and only one theta-role, and
each theta-role is assigned to one and only one argument.”


4.3.1 Stage I: purely anaphoric use of OM


At Creissels's (2006) and Dalrymple & Nikolaeva's (2011) initial stage, object markers
have a purely anaphoric function: object markers resume co-referential clause-external
objects. This is optimal if the syntactic constraints in (29) conditioning the distribution
of object marking outrank the semantic constraints, as shown in Tableau (30b), using
schematized syntactic structures.


(30) a. *[Index_i, NP_i]_VP » MAX ARG(UMENT)/VP » *øIndex[+TW] » *Index[-TW]

b.
                          | *[Index_i, NP_i]_VP | MAX ARG/VP | *øIndex[+TW] | *Index[-TW]
☞ 1. NP_i [S [V-OM_i]]    |                     |            |              |
   2. NP_i [S [V-ø_i]]    |                     | *!         |              |
   3. [S [V-OM_i NP_i]]   | *!                  |            |              |
☞ 4. [S [V NP_i]]         |                     |            |              |

Object marking is optimal when an object NP is dislocated: this is shown by can-
didate (30b)-1. Omitting object marking to resume the dislocated object, as in candi-
date (30b)-2, violates MAX ARG(UMENT)/VP, the constraint requiring an overt realization
of the object within the VP. This ranking of the constraints defines agreement as non-
optimal, however. As we can see, candidate (30b)-3, with a coreferential object marker
resuming an object within the VP, violates *[Index_i, NP_i]_VP.
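
As an illustration only, the evaluation behind Tableau (30b) can be sketched in a few lines of Python. The sketch is not part of the analysis itself. The encoding of a candidate as three Boolean features (dislocated, marked, topic-worthy), the names `violations` and `marking_wins`, and the simplification that each input (a dislocated or an in situ object) has exactly two competing candidates, one with and one without an object marker, are all assumptions made here. Violation profiles are compared lexicographically, a common way of modelling strict constraint domination.

```python
# Constraint labels follow (29); the Boolean candidate encoding is an
# assumption of this sketch, not part of the original analysis.
AGR = "*[Index_i, NP_i]_VP"   # (29a): no object marker with an in situ object
MAX = "MAX ARG/VP"            # (29b): a dislocated object must be resumed
IDX = "*øIndex[+TW]"          # (29c): [+TW] objects should be indexed
NOIDX = "*Index[-TW]"         # (29d): [-TW] objects should not be indexed


def violations(dislocated, marked, topic_worthy):
    """Return the set of constraints in (29) that one candidate violates."""
    found = set()
    if marked and not dislocated:
        found.add(AGR)
    if dislocated and not marked:
        found.add(MAX)
    if topic_worthy and not marked:
        found.add(IDX)
    if marked and not topic_worthy:
        found.add(NOIDX)
    return found


def marking_wins(ranking, dislocated, topic_worthy):
    """True if the object-marked candidate beats the unmarked one."""
    def profile(marked):
        found = violations(dislocated, marked, topic_worthy)
        # One 0/1 cell per constraint, highest-ranked first; tuples compare
        # lexicographically, so the smaller profile is the more harmonic one.
        return tuple(int(c in found) for c in ranking)
    return profile(True) < profile(False)


STAGE_I = [AGR, MAX, IDX, NOIDX]  # ranking (30a)

stage_i_pattern = {
    (dislocated, tw): marking_wins(STAGE_I, dislocated, tw)
    for dislocated in (True, False)
    for tw in (True, False)
}

for (dislocated, tw), wins in stage_i_pattern.items():
    position = "dislocated" if dislocated else "in situ"
    feature = "+TW" if tw else "-TW"
    print(f"{position:10} [{feature}] object marker: {'yes' if wins else 'no'}")
```

Under the ranking in (30a), the sketch selects the object-marked candidate for dislocated objects and the unmarked candidate for in situ objects, whatever the topic-worthiness of the object.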


4.3.2 Step 1 in the development of DOM 


The first step in the development of a DOM system involves spreading of object marking
to non-topicalized objects which are semantically topic-worthy [+TW]. This becomes
optimal when the semantic constraint requiring marking of [+TW] objects (29c) comes
to outrank the two syntactic constraints (29a)-(29b):


(31) DOM of in situ objects is optimal with the ranking
*øIndex[+TW] » *[Index_i, NP_i]_VP » MAX ARG(UMENT)/VP » *Index[-TW]


Swahili exemplifies this kind of Bantu object marking system. Recall from §2.2 that in
Swahili, we find object marking with all topicalized objects and grammatical agreement
only with [+TW] objects. Tableaux (32) exemplify this next step in the DOM grammati-
calization path.


(32) a. Object NP is [+TW]

                          | *øIndex[+TW] | *[Index_i, NP_i]_VP | MAX ARG/VP | *Index[-TW]
☞ 1. NP_i [S [V-OM_i]]    |              |                     |            |
   2. NP_i [S [V-ø_i]]    | *!           |                     | *          |
☞ 3. [S [V-OM_i NP_i]]    |              | *                   |            |
   4. [S [V NP_i]]        | *!           |                     |            |


b. Object NP is [-TW]

                          | *øIndex[+TW] | *[Index_i, NP_i]_VP | MAX ARG/VP | *Index[-TW]
☞ 1. NP_i [S [V-OM_i]]    |              |                     |            | *
   2. NP_i [S [V-ø_i]]    |              |                     | *!         |
   3. [S [V-OM_i NP_i]]   |              | *!                  |            | *
☞ 4. [S [V NP_i]]         |              |                     |            |


Tableaux (32a) and (32b) demonstrate that the anaphoric use of object marking re-
mains optimal both when a [+TW] object is topicalized and when a [-TW] object is top-
icalized. This context is illustrated by candidates (32a)-1 and (32b)-1. Candidate (32a)-3
shows that when the semantic constraint *øIndex[+TW] is high ranked, object marking
in the agreement context is optimal for a [+TW] object. However, object marking re-
mains non-optimal in the agreement context for a [-TW] object, as shown by candidate
(32b)-3.


4.3.3 Step 2: modern colloquial Chichewa 


As noted above, it is problematic to account for DOM in modern colloquial Chichewa 
using Dalrymple & Nikolaeva's (2011) grammaticalization paths, as we find both spread- 
ing of marking (to non-topical topic-worthy objects) and narrowing of marking from 
non-topic worthy topicalized objects. The constraint-based approach developed here 
can straightforwardly formalize this second step by ranking the second semantic con-
straint (29d) higher than the second syntactic constraint (29b):


(33) *øIndex[+TW] » *[Index_i, NP_i]_VP » *Index[-TW] » MAX ARG(UMENT)/VP

This is exemplified in Tableaux (34):

(34) a. Object NP is [+TW]

                          | *øIndex[+TW] | *[Index_i, NP_i]_VP | *Index[-TW] | MAX ARG/VP
☞ 1. NP_i [S [V-OM_i]]    |              |                     |             |
   2. NP_i [S [V-ø_i]]    | *!           |                     |             | *
☞ 3. [S [V-OM_i NP_i]]    |              | *                   |             |
   4. [S [V NP_i]]        | *!           |                     |             |


b. Object NP is [-TW]

                          | *øIndex[+TW] | *[Index_i, NP_i]_VP | *Index[-TW] | MAX ARG/VP
   1. NP_i [S [V-OM_i]]   |              |                     | *!          |
☞ 2. NP_i [S [V-ø_i]]     |              |                     |             | *
   3. [S [V-OM_i NP_i]]   |              | *!                  | *           |
☞ 4. [S [V NP_i]]         |              |                     |             |


Tableaux (34a) and (34b) show that with this constraint ranking, anaphoric use of ob-
ject marking is only optimal when a [+TW] object NP is dislocated: candidate (34a)-1.
Candidate (34b)-1, with object marking on a dislocated [-TW] object, violates the seman-
tic constraint, *Index[-TW]. Similarly, object marking is also optimal in the agreement
context only with a [+TW] object: candidate (34a)-3. Object marking on a [-TW] object
in the agreement context, candidate (34b)-3, violates the syntactic constraint, *[Index_i,
NP_i]_VP.¹⁷
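
The difference between the rankings in (31) and (33) can be made concrete with a short Python sketch. It is offered as an illustration under stated assumptions, not as part of the analysis: candidates are encoded as Boolean features, each input (dislocated or in situ object, [+TW] or [-TW]) is assumed to have just two competitors (with and without an object marker), and the names `violations`, `marked_contexts`, `STEP_1` and `STEP_2` are hypothetical.

```python
# Constraint labels follow (29); the encoding below is an assumption of this sketch.
AGR, MAX = "*[Index_i, NP_i]_VP", "MAX ARG/VP"   # (29a), (29b)
IDX, NOIDX = "*øIndex[+TW]", "*Index[-TW]"       # (29c), (29d)

# The four inputs, as (dislocated, topic_worthy) pairs.
CONTEXTS = [(True, True), (True, False), (False, True), (False, False)]


def violations(dislocated, marked, topic_worthy):
    """Return the set of constraints in (29) that one candidate violates."""
    found = set()
    if marked and not dislocated:
        found.add(AGR)
    if dislocated and not marked:
        found.add(MAX)
    if topic_worthy and not marked:
        found.add(IDX)
    if marked and not topic_worthy:
        found.add(NOIDX)
    return found


def marked_contexts(ranking):
    """Return the inputs for which the object-marked candidate is optimal."""
    winners = []
    for dislocated, tw in CONTEXTS:
        def profile(marked):
            found = violations(dislocated, marked, tw)
            # Highest-ranked constraint first; smaller tuple = more harmonic.
            return tuple(int(c in found) for c in ranking)
        if profile(True) < profile(False):
            winners.append((dislocated, tw))
    return winners


STEP_1 = [IDX, AGR, MAX, NOIDX]   # ranking (31)
STEP_2 = [IDX, AGR, NOIDX, MAX]   # ranking (33)

step_1 = marked_contexts(STEP_1)
step_2 = marked_contexts(STEP_2)
print("Step 1 marks:", step_1)
print("Step 2 marks:", step_2)
```

In this sketch, the ranking in (31) marks all dislocated objects plus in situ [+TW] objects, while the ranking in (33) marks [+TW] objects only, whether dislocated or in situ. Demoting MAX ARG/VP below *Index[-TW] is the single change that removes marking from dislocated [-TW] objects.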


4.3.4 Accounting for gaps 


A further advantage of this constraints-based approach is that it can account for gaps 
in the cross-Bantu object marking data. As noted above, we do not find the simple nar-
rowing of marking of topicalized objects which Dalrymple & Nikolaeva (2011) propose as
an alternative grammaticalization path, as schematized in (28). Indeed, we noted that if 
DOM in Bantu languages results from change along an anaphor-agreement continuum, 
we do not expect simple narrowing, and we would want to account for this. What I pro- 
pose is that this direction of change falls out if the two syntactic constraints (35a) and the 
two topicality-sensitive constraints (35b) have the harmonic alignment rankings shown 
in (35): 


(35) Harmonic alignment
a. *[Index_i, NP_i]_VP » MAX ARG(UMENT)/VP
b. *øIndex[+TW] » *Index[-TW]


A harmonic alignment ranking cannot be reordered to define a typology (Aissen 2003; 
Morimoto 2002). As we can see in the Tableaux in (37), narrowing without spreading (cf. (27)
and (28), above) is only optimal given the ranking in (36), which violates the harmonic 
ranking of the semantic constraints defined in (35b). 


(36) *[Index_i, NP_i]_VP , *Index[-TW] » MAX ARG(UMENT)/VP » *øIndex[+TW]


(37) a. Object NP is [+TW]

                          | *[Index_i, NP_i]_VP , *Index[-TW] | MAX ARG/VP | *øIndex[+TW]
☞ 1. NP_i [S [V-OM_i]]    |                     ,             |            |
   2. NP_i [S [V-ø_i]]    |                     ,             | *!         | *
   3. [S [V-OM_i NP_i]]   | *!                  ,             |            |
☞ 4. [S [V NP_i]]         |                     ,             |            | *


b. Object NP is [-TW]

                          | *[Index_i, NP_i]_VP , *Index[-TW] | MAX ARG/VP | *øIndex[+TW]
   1. NP_i [S [V-OM_i]]   |                     , *!          |            |
☞ 2. NP_i [S [V-ø_i]]     |                     ,             | *          |
   3. [S [V-OM_i NP_i]]   | *!                  , *           |            |
☞ 4. [S [V NP_i]]         |                     ,             |            |


17 As a reviewer points out, the analysis developed here does not account for the variation we find in Chi-
chewa. Object marking is possible with all dislocated objects, even non-topic-worthy ones. The DOM re-
striction is therefore a tendency, not an absolute. How best to formalize this variation is a topic for future
research.

Comparing the first candidates in Tableaux (37a) and (37b) allows one to see that this
constraint ranking optimizes narrowing. Anaphoric use of object marking is optimal
only when a [+TW] object is dislocated, as in candidate (37a)-1, but not when a [-TW]
object is dislocated, as in candidate (37b)-1. Note that candidate (37b)-1 violates the se-
mantic constraint, *Index[-TW]. Object marking in the agreement context is not opti-
mal, whether the object is topic-worthy or not, as this violates the syntactic constraint,
*[Index_i, NP_i]_VP. Candidates (37a)-3 and (37b)-3 illustrate this. While this ranking clearly
can define narrowing, it violates the harmonic ranking of the semantic constraints. Fi-
nally, the constraints-based approach can explain why Creissels (2006) says he finds no
examples of his Stage III: a purely grammatical agreement system for object marking
which ignores the topic-worthiness of the object. To make this kind of agreement system
optimal, we must introduce a new semantic constraint, *øIndex[-TW], which clearly
contradicts the better motivated constraint, *Index[-TW]. This new constraint, highly
ranked, optimizes object marking on both topic-worthy and non-topic-worthy objects in
the agreement context:


(38) *øIndex[+TW] » *øIndex[-TW] » *[Index_i, NP_i]_VP » MAX ARG(UMENT)/VP
» *Index[-TW]


However, as Tableau (39) exemplifies, this same ranking cannot define Creissels's 
(2006) Stage III, because it incorrectly optimizes object marking to resume topicalized 
objects: 


(39) Object NP is either [+TW] or [-TW]

                          | *øIndex[+TW] | *øIndex[-TW] | *[Index_i, NP_i]_VP | MAX ARG/VP | *Index[-TW]
☞ a. NP_i [S [V-OM_i]]    |              |              |                     |            | (*)
   b. NP_i [S [V-ø_i]]    | (*!)         | (*!)         |                     | *          |
☞ c. [S [V-OM_i NP_i]]    |              |              | *                   |            | (*)
   d. [S [V NP_i]]        | (*!)         | (*!)         |                     |            |


Tableau (39) shows that these constraints and this ranking optimize agreement with any 
co-referential object, whether topicalized (candidates (39)-a and (39)-b) or in a grammat-
ical agreement context (candidates (39)-c and (39)-d). Stage III, therefore, is not found
because it is not optimal under any ranking of the proposed constraints that define a 
grammaticalization path leading to a DOM system. 
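
The outcomes reported for the rankings in (36) and (38) can be checked with a small Python sketch. This is an illustrative reconstruction under explicit assumptions rather than part of the analysis: candidates are encoded as Boolean features, each input is assumed to have two competitors (with and without an object marker), the unranked pair in (36) is tested in both orders, and all function and variable names are hypothetical.

```python
# Constraint labels follow (29) and (38); the encoding is an assumption of this sketch.
AGR, MAX = "*[Index_i, NP_i]_VP", "MAX ARG/VP"   # (29a), (29b)
IDX, NOIDX = "*øIndex[+TW]", "*Index[-TW]"       # (29c), (29d)
IDX_NTW = "*øIndex[-TW]"                          # extra constraint used in (38)

# The four inputs, as (dislocated, topic_worthy) pairs.
CONTEXTS = [(True, True), (True, False), (False, True), (False, False)]


def violations(dislocated, marked, topic_worthy):
    """Return the set of constraints that one candidate violates."""
    found = set()
    if marked and not dislocated:
        found.add(AGR)
    if dislocated and not marked:
        found.add(MAX)
    if not marked:
        # An unindexed object violates the *øIndex constraint for its TW value.
        found.add(IDX if topic_worthy else IDX_NTW)
    if marked and not topic_worthy:
        found.add(NOIDX)
    return found


def marked_contexts(ranking):
    """Inputs for which the object-marked candidate is optimal.

    Constraints that are absent from `ranking` are simply ignored.
    """
    winners = []
    for dislocated, tw in CONTEXTS:
        def profile(marked):
            found = violations(dislocated, marked, tw)
            return tuple(int(c in found) for c in ranking)
        if profile(True) < profile(False):
            winners.append((dislocated, tw))
    return winners


# (36): the first two constraints are unranked, so both orders are tried.
RANKING_36_A = [AGR, NOIDX, MAX, IDX]
RANKING_36_B = [NOIDX, AGR, MAX, IDX]
RANKING_38 = [IDX, IDX_NTW, AGR, MAX, NOIDX]

narrowing = marked_contexts(RANKING_36_A)
across_the_board = marked_contexts(RANKING_38)
print("(36) marks:", narrowing)
print("(38) marks:", across_the_board)
```

In this sketch, either order of the unranked pair in (36) leaves only dislocated [+TW] objects marked, while the ranking in (38) marks the object in all four contexts, including dislocated ones. This matches the observation that (38) cannot single out agreement with in situ objects alone.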


5 Conclusion 


As we have seen, object markers are not “purely anaphoric” in modern colloquial Chi- 
chewa. They are also not pure agreement markers, as they occur only variably (not 
obligatorily), and they only co-occur with clause-internal human objects. Rather, their 
distribution conforms to the observation of Bentley (1994), Morimoto (2002), Riedel (2009), Marten &
Kula (2012) and van der Wal (2015) that the occurrence of grammati-
cal agreement-like object markers in Bantu languages is conditioned by the hierarchies
in (8). As a result, in Chichewa, as in many Bantu languages, we find a DOM system.
Following Iemmolo (2013; 2014), I have proposed that the grammaticalization path to-
wards DOM is for object markers to come to index not just topic-hood (an information
structural and/or syntactic property) but also topic-worthiness (a semantic property). In
Chichewa, as I have shown, topic-worthiness is quite systematically indexed. This ob- 
servation forms the basis for a constraints-based account of the development of DOM 
in Bantu languages, which improves on Creissels (2006) by incorporating the notion 
of topic-worthiness as a trigger for the movement from anaphoric agreement to gram- 
matical agreement. It improves on Dalrymple & Nikolaeva (2011) by providing a way of 
formalizing the anaphor-agreement continuum that is central to the discussion of the 
development of DOM in Bantu languages. It is hoped this proposal provides a useful ba- 
sis for a more comprehensive study of the DOM properties of object marking in Bantu 
languages.


Abbreviations

1     first person                   LOC   locative
2     second person                  OBJ   object
3     third person                   PL    plural
CL    noun class concord affixes     POSS  possessive
      (e.g. cl1, cl2, etc.)          PRF   perfect
COP   copula                         PROG  progressive
DEM   demonstrative                  PST   past
EMPH  emphasis                       Q     question marker
FUT   future                         REL   relative
FV    final vowel                    SBJ   subject
HAB   habitual                       SG    singular
INF   infinitive




References 


Aissen, Judith. 2003. Differential object marking: Iconicity vs. Economy. Natural Lan-
guage and Linguistic Theory 21(3). 435-483.

Bax, Anna & Michael Diercks. 2012. Information structure constraints on object marking 
in Manyika. Southern African Linguistics and Applied Language Studies 30(2). 185-202. 

Bearth, Thomas. 2003. Syntax. In Derek Nurse & Gérard Philippson (eds.), The Bantu 
languages, 121-142. London: Routledge. 

Beaudoin-Lietz, Christa, Derek Nurse & Sarah Rose. 2004. Pronominal object marking 
in Bantu. In Akin Akinlabi & Oluseye Adesola (eds.), Proceedings of the 4th World 
Congress of African linguistics, New Brunswick 2003, 175-188. Cologne: Köppe.

Bentley, Mayrene. 1994. The syntactic effects of animacy in Bantu languages. Bloomington, 
IN: Indiana University dissertation. 

Bresnan, Joan & Sam Mchombo. 1987. Topic, pronoun, and agreement in Chichewa. Lan- 
guage 63(4). 741-782. 

Buell, Leston Chandler. 2005. Issues in Zulu verbal morphosyntax. Los Angeles: University 
of California at Los Angeles dissertation. 

Cheng, Lisa Lai-Shen & Laura J. Downing. 2009. Where's the topic in Zulu? The Linguistic 
Review 26(2-3). 207-238. 

Cheng, Lisa Lai-Shen & Laura J. Downing. 2012. Against FocusP: Evidence from Durban 
Zulu. In Ivona Kučerová & Ad Neeleman (eds.), Contrasts and positions in information 
structure, 247-266. Cambridge: Cambridge University Press. 

Cheng, Lisa Lai-Shen & Laura J. Downing. 2016. Phasal syntax = cyclic phonology? Syn-
tax 19(2). 156-191.

Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht: Foris. 

Comrie, Bernard. 1981. Language universals and linguistic typology. Chicago: University 
of Chicago Press. 

Comrie, Bernard. 1989. Language universals and linguistic typology. 2nd edn. Chicago: 
University of Chicago Press. 

Creissels, Denis. 2006. A typology of subject and object markers in African languages. 
In F. K. Erhard Voeltz (ed.), Studies in African linguistic typology, 43-70. Amsterdam: 
John Benjamins. 

Dalrymple, Mary & Irina Nikolaeva. 2011. Objects and information structure. Cambridge: 
Cambridge University Press. 

Downing, Laura J. 2010. Prosodic phrasing in relative clauses: A comparative look at Zulu, 
Chewa and Tumbuka. In Karsten Legère & Christina Thornell (eds.), Bantu languages:
Analyses, description and theory, 17-29. Cologne: Rüdiger Köppe.

Downing, Laura J. & Larry M. Hyman. 2016. Information structure in Bantu languages. 
In Caroline Féry & Shinichiro Ishihara (eds.), Oxford handbook of information structure, 
790-813. Oxford: Oxford University Press. 

Downing, Laura J. & Al Mtenje. 2011. Prosodic phrasing of Chichewa relative clauses. 
Journal of African Languages and Linguistics 32(1). 65-111.




Duranti, Alessandro. 1979. Object clitic pronouns in Bantu and the Topicality Hierarchy. 
Studies in African Linguistics 10(1). 31-45. 

Duranti, Alessandro & Ernest Rugwa Byarushengo. 1977. Haya grammatical structure. In 
Ernest Byarushengo, Alessandro Duranti & Larry M. Hyman (eds.), Southern California
occasional papers in linguistics, vol. 6, 45-71. Department of Linguistics, University of
Southern California. 

Givón, Talmy. 1976. Topic, pronoun, and grammatical agreement. In Charles N. Li (ed.), 
Subject and topic, 149-188. New York: Academic Press. 

Heine, Bernd. 1976. A typology of African languages. Berlin: Dieter Reimer. 

Henderson, Brent M. 2006. The syntax and typology of Bantu relative clauses. Urbana: 
University of Illinois at Urbana-Champaign dissertation. 

Hyman, Larry M. & Alessandro Duranti. 1982. On the object relation in Bantu. In San- 
dra A. Thompson & Paul J. Hopper (eds.), Studies in transitivity, 217-239. New York: 
Academic Press. 

Hyman, Larry M. & Anne Hawkinson. 1974. Hierarchies of natural topics in Shona. Stud- 
ies in African Linguistics 5(2). 147-170. 

Iemmolo, Giorgio. 2013. Symmetric and asymmetric alternations in direct object encod-
ing. STUF - Language Typology and Universals 66. 378-403.

Iemmolo, Giorgio. 2014. Differential object marking: An overview. University of Zurich 
Ms. 

Kanerva, Jonni. 1990. Focus and phrasing in Chichewa phonology. New York: Garland 
Publishing. 

Kisseberth, Charles W. & Mohammad Imam Abasheikh. 1977. The object relationship in 
Chi-Mwi:ni, a Bantu language. In Peter Cole & Jerrold M. Sadock (eds.), Grammatical
relations, 179-218. New York: Academic Press. 

Marten, Lutz & Nancy Kula. 2012. Object marking and morphosyntactic variation in 
Bantu. Southern African Linguistics and Applied Language Studies 30(2). 237-253. 

Mchombo, Sam. 2004. The syntax of Chichewa. Cambridge: Cambridge University Press. 

Meeussen, Achille E. 1967. Bantu grammatical reconstructions. Africana Linguistica 3. 
79-121. 

Morimoto, Yukiko. 2002. Prominence mismatches and differential object marking in 
Bantu. In Miriam Butt & Tracy Holloway King (eds.), The Proceedings of the LFG'02 
Conference, 292-314. Stanford, CA: CSLI Publications. 

Nurse, Derek. 2003. Aspect and tense in Bantu languages. In Derek Nurse & Gérard 
Philippson (eds.), The Bantu languages, 90-102. London: Routledge. 

Riedel, Kristina. 2009. The syntax of object marking in Sambaa: A comparative Bantu per- 
spective. Leiden: Universiteit Leiden dissertation. 

Schadeberg, Thilo. 1995. Object diagnostics in Bantu. In Emmanuel 'Nolue Emenanjo 
& Ozo-Mekuri Ndimele (eds.), Issues in African languages and linguistics: Essays in 
honour of Kay Williamson. Special issue of The Nigerian Language Studies, 173-180. Aba: 
National Institute for Nigerian Languages. 




Seidl, Amanda & Alexis Dimitriadis. 1997. The discourse function of object marking in 
Swahili. In Kora Singer, Randall Eggert & Gregory Anderson (eds.), Proceedings of the 
33rd Annual meeting of the Chicago Linguistics Society, 373-389. 

Stucky, Susan. 1981. Word order variation in Makua: A phrase structure grammar analysis. 
Urbana: University of Illinois at Urbana-Champaign dissertation. 

van der Spuy, Andrew. 1993. Dislocated noun phrases in Nguni. Lingua 90(4). 335-355. 

van der Wal, Jenneke. 2009. Word order and information structure in Makhuwa-Enahara. 
Leiden: University of Leiden dissertation. 

van der Wal, Jenneke. 2015. Bantu object clitics as defective goals. Revue roumaine de 
linguistique, LX 2-3. 277-296. 

Witzlack-Makarevich, Alena & Ilja A. Seržant. 2018. Differential argument marking:
Patterns of variation. In Ilja A. Seržant & Alena Witzlack-Makarevich (eds.), Di-
achrony of differential argument marking, 1-40. Berlin: Language Science Press.


Zeller, Jochen. 2012. Object marking in isiZulu. Southern African Linguistics and Applied 
Language Studies 30(2). 219-235. 

Zerbian, Sabine. 2006. Expression of information structure in the Bantu language Northern 
Sotho. Berlin: Humboldt University of Berlin dissertation. 




Chapter 3 


The evolution of differential object 
marking in Alor-Pantar languages 


Marian Klamer 


Leiden University 


František Kratochvíl


Palacký University, Olomouc


This paper investigates the evolution of Differential Object Marking (DOM) in Abui and 
Teiwa, two Papuan languages of the Alor-Pantar family in Eastern Indonesia. In both lan- 
guages, reflexes of the same proto-morpheme are used in the differential marking of P (the 
non-agentive argument in transitive constructions), but the languages contrast in the way 
Ps are differentiated. We compare the synchronic DOM patterns of Abui and Teiwa with 
each other as well as with the DOM patterns we reconstruct for their shared ancestor. We 
establish how different patterns of DOM in this family have evolved over time, and which 
semantic and morphological changes occurred in the process. 


In their morphological expression, there are two strategies by which Ps are differentiated:
(i) the asymmetrical strategy involves an opposition between P as either a verbal prefix or 
a free nominal, and (ii) the symmetrical strategy where the choice of a P-prefix is variable 
depending on the semantics of P. Both strategies are used in both Teiwa and Abui, but the 
symmetrical strategy involves a choice between two different prefixes in Teiwa and five 
different prefixes in Abui. 


Different factors trigger DOM in both languages: in Teiwa it is mostly based on the inherent 
properties (animacy) of P, while in Abui there are many other triggers besides the animacy 
of P, including the affectedness relation between the action and the P referent and the in- 
flectional class of the verb. Furthermore, Abui has developed an extra, third, formal strategy 
to differentiate human Ps from non-human ones in a serial verb construction. 


The alignment system we reconstruct for the proto-language was semantic. It evolved into 
an accusative alignment system in Teiwa, but was retained and further complexified in Abui. 
Alignment systems are not static: their forms and triggers may be modified and complexified 
over time. 


Marian Klamer & František Kratochvíl. The evolution of differential object marking 
in Alor-Pantar languages. In Ilja A. Seržant & Alena Witzlack-Makarevich (eds.), 
Diachrony of differential argument marking, 61-85. Berlin: Language Science Press. 
1 Introduction 


This paper describes and compares the differential object marking in Teiwa (Klamer 
2010a) and Abui (Kratochvíl 2007; 2014a; Kratochvíl & Delpada 2015b), two members 
of the AP language family of Papuan¹ languages spoken in eastern Indonesia (Figures 
1-3). We show that different members of a language family may show different patterns 
of Differential Object Marking (DOM) that are triggered by different factors and involve 
different forms, and that the evolutionary path of DOM has both stable and unstable 
features. 




Figure 1: The islands of Timor, Alor and Pantar in Indonesia 


After an introduction to the history and typology of the Alor-Pantar (AP) language 
family (§2), we present evidence that Proto-AP (the ancestor language of Teiwa and Abui) 
treated both transitive objects (P) and intransitive subjects (S) in a split fashion, and we 
list the morphological forms involved in the proto-splits (§3). In §4, we describe the 
formal and semantic characteristics of DOM in Teiwa, pointing out the elements of the 
proto-DOM system that have been retained, changed and lost in Teiwa. In §5, we 
similarly describe DOM in Abui and compare it to the proto-system. 

By studying patterns of DOM in these two related languages and comparing them with 
their shared ancestor, we can establish how different patterns of DOM evolve over time, 
and which semantic and morphological changes occur in the process. For the descriptive 
data presented in this paper, we build on our own publications on Teiwa and Abui, as well 
as unpublished fieldwork data included in the respective corpora of Teiwa and Abui.² 


¹Note that the term ‘Papuan’ is not a genealogical term, but rather refers to a cluster of several dozen 
unrelated language families that are spoken on or close to the Papuan mainland, and are not Austronesian. 

²These corpora are available as part of the Laiseang corpus in The Language Archive (TLA) at the Max 
Planck Institute for Psycholinguistics in Nijmegen, http://tla.mpi.nl. 
Figure 2: The Papuan languages of Timor (in the areas that are left white, 
Austronesian languages are spoken). Map © Owen Edwards 2017. 


Figure 3: The languages of Alor and Pantar. Map © Owen Edwards 2017. 
For the typological component of the paper, we have used information on argument 
encoding in the AP languages that has been published elsewhere (e.g. Klamer 2010b,c; 2017; 
Kratochvíl 2011; 2014a; Klamer & Kratochvíl 2012; Klamer & Schapper 2012; Fedden et al. 
2013; 2014; Kratochvíl & Delpada 2015a; 2015b). For the historical reconstruction of the 
DOM system in Proto-AP, we draw on published historical reconstruction work on the 
AP family (Holton et al. 2012; Holton & Robinson 2014; 2017). 


2 Introduction to the history and typology of Alor-Pantar 
languages 


Together with the Papuan languages spoken on the neighbouring island of Timor, the AP 
sub-family constitutes the larger Timor-Alor-Pantar family, which counts about 30 languages 
(Figures 2-4) (Holton et al. 2012; Holton & Robinson 2014; 2017; Robinson & Kratochvíl 
2014; Schapper 2014; Schapper et al. 2017). An indication of the position of Teiwa and 
Abui in the Timor-Alor-Pantar family tree is shown in Figure 4. Based on phonological 
innovations (Holton et al. 2012), we assert that Teiwa and Abui share a common ancestor, 
Proto-AP, but are not direct sister languages, as it is possible to construct an intermediate 
node (labelled Proto-Alor in Figure 4) between Teiwa and Abui. 


Proto-Timor-Alor-Pantar 
    Proto-Alor-Pantar 
        ... Teiwa ... 
        Proto-Alor 
            ... Abui ... 
    Proto-Timor 


Figure 4: The position of Teiwa and Abui in the Timor-Alor-Pantar family tree 
(derived from Holton et al. 2012: 114, Fig. 2). 


Basic (pragmatically unmarked, declarative) transitive clauses in the AP languages are 
verb-final, and Agent-Patient-Verb (APV) and Subject-Verb (SV) are the basic constituent 
orders attested in all the modern languages.³ Objects in AP languages are expressed with 
free nominal constituents (NPs or pronouns), which exist alongside verbal affixes that 
index person and number of verbal arguments. The AP languages are all head-marking 
and show a preference for indexing P over S/A (Klamer 2017: 20). This pattern is 
typologically extremely rare, occurring in only 7% of the 378 languages surveyed by 
Siewierska (2013), yet it is universally found in the AP family. In other words, in AP, a person-number 


³The notions A, S and P are used here as comparative concepts, where A is the most Agent-like argument of 
a transitive clause and P the least Agent-like, while S is the single argument of an intransitive verb (Comrie 
1989). 
prefix on a verb typically indexes the object (P), while subjects (S/A) may also be indexed 
but are more typically expressed as free forms (pronouns or NPs). 

Differential Object Marking (DOM) is seen here as ‘the non-uniform grammatical 
marking of objects which occurs within one and the same language, with objects of one 
and the same verb' (Dalrymple & Nikolaeva 2011: 1). The grammatical marking of objects 
in AP languages involves differential patterns of object indexing on verbs (Iemmolo 2011), 
and in this respect is crucially different from differential marking of arguments by case 
marking on the noun phrase. In the AP family, nouns are never marked for case, and 
alignment is always defined relative to the pronominal indexing of the verb. 

Other crucial differences between the AP languages and the well-known European 
languages include the following. First, AP languages have few, if any, tri-valent 
(ditransitive) verbs. Instead of having a predicate with three arguments, two of which are 
object-like, the languages use a strategy where serial verb constructions express events which 
involve more than two participants. Second, the object (P) of a bi-valent verb in AP 
languages can express a multitude of semantic roles: a P may be a semantic patient, 
recipient, goal, benefactive, or source. This is illustrated for Teiwa in (1a), where P is a 
patient; in (1b), where the P of bi-valent -an ‘give’ is a recipient; in (1c), where the P of -mian 
‘put at’ is a goal; in (1d), where the P of -lal ‘show’ is a benefactive; and in (1e), where the 
P of -umbangan ‘ask (something) from someone’ is a source. Similar observations can be 
made for Abui, see (10a)-(10e) below. 


(1) Teiwa (Klamer 2010a: 114, 169, 334-335, fieldnotes, TSS: 001) 

    a. Sematar          na  h-ua’. 
       in.a.moment(IND) 1SG 2SG-hit 
       ‘I hit you!’ 

    b. Uy     gaan u    sen   ma   n-oma’          g-an. 
       person DEM  DIST money come 1SG.POSS-father 3SG-give 
       ‘That person gives my father money.’ 

    c. Jadi hala   biar     kriman la  pin  aria’  ma   ni-mian... 
       so   others children small  FOC hold arrive come 1PL-put.at 
       ‘So other people brought some small children here and gave them to us...’ 

    d. Yitar ga-qau        ma   na-lal-an. 
       road  3SG.POSS-good come 1SG-show-REAL 
       ‘[You] show me the right way.’ 

    e. A   daa    n-um-bangan. 
       3SG ascend 1SG-APPL-ask.for 
       ‘He comes up to ask [sth.] from me’ or ‘He comes up to ask me [for/about sth.]’ 


⁴Orthographic conventions used in this article: x = /ħ/, q = /q/, ’ = /ʔ/, and a double vowel symbol stands for 
a long vowel. 




Note that in (1b), (1c) and (1d) the theme participants (sen ‘money’, biar kriman ‘small 
children’, yitar gaqau ‘right way’) are introduced with a separate verb (ma ‘come’).⁵ This 
verb occurs in a serial verb construction with a second verb in clause-final position. 
The second verb carries the P-prefix. Homologous affixes combine with nouns to index 
possessors: examples include n-oma’ ‘1SG.POSS-father’ in (1b) and ga-qau ‘3SG.POSS-good’ 
in (1d). 


3 Differential object marking in Proto-Alor-Pantar 


Pronouns and pronominal indexes are known to belong to the most stable and archaic 
part of the lexicon (Filimonova 2005; Heine & Song 2011a,b). Given their stability, pro- 
nouns have been used to suggest deep genetic relationships (Nichols & Peterson 2013). 
The morpho-syntactic patterns attested in the modern AP languages regularly involve 
morphemes reflecting forms that are reconstructable up to the ancestor language of the 
family, Proto-AP. 

Table 1 lists the reconstructed pronoun forms (Holton et al. 2012; Robinson & 
Kratochvíl 2014; Holton & Robinson 2017: 170). In AP pronouns, initial consonants encode 


⁵This function of Teiwa ma is further described in Klamer (2010a,b). 
⁶Example (1c) involves another serial verb (pin aria’ ‘arrive holding something’). We will not discuss 
serialization in Teiwa or Abui here; see the respective grammars for further information. 


Table 1: Reconstructed forms for A, P, and Possessor in Proto-Alor-Pantar 

          A free pronoun   P prefix   Possessor prefix 
1SG       *na(N)ᵃ          *na- 
2SG       *a(N)            *(h)a- 
3         *ga(N)           *ga-ᵇ      *ge-ᶜ 
DISTR                      *ta- 
1PL.INC   *pi(N)           *pi- 
1PL.EXC   *ni(N)           *ni- 
2PL       *i(N)            *(h)i-ᵈ 

ᵃN represents a nasal unspecified for place. 

ᵇHolton & Robinson (2017) reconstruct two separate third person prefixes, of which the singular is *ga- and 
the plural *gi-. 

ᶜProto-AP may also have had possessor prefixes for other persons but only the third person form is 
reconstructed so far. Possible reconstructed forms would be *ne- ‘1SG’, *(h)e- ‘2SG’, *te- ‘DISTR’. In the plural, the 
vowel distinction was likely neutralized. 

ᵈRobinson & Kratochvíl (2014) do not reconstruct the initial consonant of this prefix as optional, because of 
the regular reflex of Proto-AP *h in Western Pantar and Sar. 
person features, while theme vowels encode number features (/a/ singular, /i/ plural) and 
possession (/e/).⁷ 
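
The compositional make-up of the reconstructed prefixes (person consonant plus theme vowel) can be made concrete with a small sketch. The Python below is illustrative only: the function name `proto_ap_prefix` and the dictionary layout are assumptions introduced for this sketch, and the forms it returns are simply those listed in Table 1. Because only the third person possessor prefix has been reconstructed so far, the sketch refuses to build possessor forms for other persons.

```python
# Illustrative sketch (not from the source): Proto-AP person prefixes
# modelled as "person consonant + theme vowel", following Table 1.
# All names here are hypothetical helpers chosen for the example.

PERSON_CONSONANT = {
    "1SG": "n",
    "2SG": "(h)",
    "3": "g",
    "DISTR": "t",
    "1PL.INC": "p",
    "1PL.EXC": "n",
    "2PL": "(h)",
}

# Persons whose prefix carries the singular theme vowel /a/.
# DISTR is grouped with the singular forms because it carries /a/.
SINGULAR_VOWEL_PERSONS = {"1SG", "2SG", "3", "DISTR"}


def proto_ap_prefix(person: str, possessor: bool = False) -> str:
    """Return the reconstructed prefix for a person category."""
    consonant = PERSON_CONSONANT[person]
    if possessor:
        # Only *ge- is reconstructed; other possessor forms are conjectural.
        if person != "3":
            raise ValueError("only the third person possessor prefix is reconstructed")
        vowel = "e"
    elif person in SINGULAR_VOWEL_PERSONS:
        vowel = "a"
    else:
        vowel = "i"
    return f"*{consonant}{vowel}-"


if __name__ == "__main__":
    for person in PERSON_CONSONANT:
        print(person, proto_ap_prefix(person))
```

Keeping consonant and vowel in separate lookups mirrors the analysis in the text, where person and number are encoded by different segments of the prefix.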

In addition to reconstructing the form of the Proto-AP prefixes, we can also reconstruct 
some of the Proto-AP bi-valent verbs as bound forms, and others as unbound. We 
reconstruct a verb as bound when it has a P-prefix in daughter languages across the family, 
while a verb is reconstructed as unbound when all its modern reflexes lack a P-prefix. 
The reconstructed verbs are given in Table 2. 


Table 2: Reconstructed bi-valent verbs in Proto-Alor-Pantar (Holton et al. 2012; 
Holton & Robinson 2017; Schapper et al. 2017; Klamer in press). 

With P-prefix                               Without P-prefix 
Proto-AP verb   Meaning                     Proto-AP verb   Meaning 
*-ten           wake up someone             *tapai          pound, pierce 
*-wel           bathe someone               *mi             be in, be at 
*-ena           give to someone             *magi           hear 
*-asi           bite someone (of dogs)      *(ta)ki         bite (food?) 
                                            *nai            eat 
                                            *med            take 
                                            *kabar          scratch 
                                            *tiari(n)       closeᵃ 

ᵃHolton & Robinson (2017: 75) reconstruct ‘close’ with a prefix. We find no evidence for this in a larger 
dataset. 


In other words, Proto-AP encoded its Ps in a split fashion: certain verbs indexed P 
using a pronominal prefix, other verbs used (only) a free form to express P. Even with the 
limited evidence these verbs provide us with, it is already possible to see that this split 
in P-marking probably had a semantic motivation. For the reconstructed verbs with a 
P-prefix, the prefix likely indexed a human/animate referent, as waking up and bathing 
someone are activities applied to a human object. Also, across the AP family, the 
(single) object of the verb ‘give’ is always a human referent (the P-prefix always indexes 
a recipient), while the theme (the thing given) is encoded as either a separate oblique 
constituent or with its own predicate, using a serialization strategy (Klamer & Schapper 
2012). 

In contrast, the verbs that are reconstructed without a P-prefix such as ‘be in, be at’, 
‘eat’, and ‘take’ seem to typically have an inanimate P. The object of the verb ‘scratch’ is 
typically a surface (which may or may not be a human skin). The verb ‘pound’ typically 


⁷Proto-AP *ta- is grouped with the singular forms in Table 1 because it carries the singular theme vowel /a/. 
*ta- has a common or impersonal referent (cf. one in English ‘One should consider this’), and its reading is 
often distributive or reflexive (‘each one’, ‘each other’). 
refers to pounding food objects (e.g. rice or corn). The two verbs for ‘bite’ may have 
been split in use depending on the animacy of the object. And in the AP languages, 
the verb ‘hear’ does not typically take a personal object (as in I heard your father sing) 
but rather a sound or a sound-producing event (e.g. Your father's singing, I heard it). 
In sum, Proto-Alor-Pantar had a split in the marking of P, and this split was probably 
motivated by the distinction between human/animate objects (which were indexed with 
a verbal prefix) versus inanimate objects (which were expressed as free constituents). The 
fact that the feature ‘human/animate’ triggers the indexing of Ps is cross-linguistically 
not unusual: agreement is often sensitive to the discourse salience of arguments, and 
since humans/animates have more discourse prominence than inanimates they are more 
eligible to be indexed on verbs (cf. Dalrymple & Nikolaeva 2011). 

In addition to a split P-marking, the proto-language may also have had a split in the 
marking of intransitive subjects (S) that was based on semantics (Klamer 2012; Robinson 
& Kratochvíl 2014), a system referred to in the literature as ‘semantic alignment’ (Mithun 
1991; Donohue & Wichmann 2008), in contrast to ‘accusative alignment’ or ‘ergative 
alignment’. Languages with accusative alignment treat S and A alike, as opposed to P; 
languages with semantic alignment encode S sometimes like P (by prefixing it to the verb, 
as in the AP languages), and sometimes like A (e.g. by expressing it as a free pronoun, as in 
the AP languages). The variable encoding of S is motivated by the semantics of the verb 
and its argument, but the lexical sub-categorisation characteristics of verbs also play a 
role (cf. Fedden et al. 2013; 2014). 

The hypothesis that Proto-AP had semantic alignment is based on the following 
observations.⁸ First, AP languages with semantic alignment are found across the region, while 
languages with accusative alignment are confined to a region in the centre, as shown in 
Figure 5. This geographical spread suggests that semantic alignment was the original 
pattern from which the accusatively aligning languages diverged. 

Second, some languages that today have accusative alignment show morphological 
traces of semantic alignment. An example is Kaera (Pantar), which encodes the S of 
certain intransitive verbs with a prefix otherwise typically used to index P arguments 
(Klamer 2014: 135-136). This Kaera class of verbs includes verbs such as ‘live’, ‘be silent’, 
‘jump up’, ‘faint, be unconscious’, ‘think’, ‘give birth’.⁹ The presence of such 
morphological fossils suggests that there may have been an earlier historical stage with semantic 
alignment from which modern Kaera with accusative alignment has developed. 

Third, some languages that are accusatively aligning today are still attuned to seman- 
tic factors in the alignment of P. Examples are Adang (Haan 2001; Robinson & Haan 
2014) or Blagar (Steinhauer 2014). This sensitivity to semantics in an otherwise accusative 


⁸To reconstruct the alignment system of Proto-Alor-Pantar with confidence, comparative data from cognate 
sets of a sizable number of verbs across a wide range of Alor-Pantar languages need to be collected and 
their alignment patterns compared, work that has yet to be done. 

⁹Although the coverage of our comparative database is currently insufficient to determine whether the 
Kaera forms are regularly inherited from the Proto-AP lexicon, verbs with similar senses regularly 
either allow or require S-indexing in semantically aligned languages such as Western Pantar (Holton 2014), 
Klon (Baird 2008), Abui (Kratochvíl 2007; 2011), Kamang (Schapper 2014), Sawila (Kratochvíl 2014b), and 
Wersing (Schapper et al. 2017). 
Figure 5: Semantic (green) and accusative (red) alignment in Alor-Pantar 
languages. (For the language areas left white, information on alignment is 
lacking.) 


alignment system suggests that the language developed from an earlier language with 
semantic alignment.¹⁰ 

If Proto-Alor-Pantar indeed had semantic alignment, then it must have expressed 
intransitive S sometimes like A, using a free form, and sometimes like P, using a verbal 
prefix (compare Table 2). Some examples of reconstructed mono-valent verbs in 
Proto-AP are presented in Table 3.¹¹ 

We have not, or not yet, been able to reconstruct bound mono-valent verbs, i.e. verbs 
that encode their S argument with a prefix in their modern reflexes across the AP family. 
The evidence for the semantic alignment of Proto-AP is thus circumstantial. 

To summarize, the following grammatical information about Proto-AP, the ancestor 
language of Teiwa and Abui, has been presented: 


1. The reconstructed pronouns include free and bound forms that are formally clearly 
related (cf. Table 1). 


2. In Proto-AP, free pronouns express A while bound pronouns typically express P 
and Possessor. 


3. Proto-AP has some kind of DOM, as Ps are expressed in a split fashion: some bi- 
valent verbs take a P-prefix, other bi-valent verbs express P with a free form. 


¹⁰In Adang, objects are either indexed by prefixes on the verb or expressed by free object pronouns. There is 
a tendency for verbs with animate objects to be prefixing (Fedden et al. 2013). In Blagar, various degrees of 
affectedness can be distinguished using object pronouns, possessive pronouns, or a prefix (Steinhauer 2014: 
167, 189). 

¹¹Holton & Robinson (2017: 75) reconstruct ‘close’ with a prefix. We find no evidence for this in a larger 
dataset. 
Table 3: Reconstructed mono-valent verbs in Proto-Alor-Pantar (Holton et al. 
2012; Holton & Robinson 2017; Schapper et al. 2017; Klamer in press). 

Proto-AP verb   Meaning 
*tas            stand 
*tia            sleep 
*purVn          spit 
*jagir          laugh 
*luk(V)         crouch 
*mai            come (here) 
*kabar          scratch 
*tiari(n)       close 


4. The P-split is likely based on the distinction between human/animate and 
inanimate referents, where human/animate Ps are indexed on the verb and inanimate 
Ps are not. 


5. Proto-AP likely had semantic alignment, encoding the S of certain intransitive 
verbs with a prefix otherwise typically used to index P arguments. However, so 
far we have only been able to reconstruct mono-valent verbs with a free-standing S. 


4 Differential object marking in Teiwa 


In Teiwa, some of the Proto-AP properties listed above were retained, while others were 
lost. Teiwa retained both the proto-prefix for P (and some S) and the free proto-pronoun 
that encoded A (and some S). The full set of Teiwa pronouns and person prefixes encoding 
A, P, S, and the possessor is given in Table 4. (Using a long rather than a short free 
pronoun encodes contrastive focus of A and S in Teiwa.) As in Proto-AP, free pronouns 
express A while bound pronouns typically express P and Possessor. Unlike Proto-AP, 
Teiwa has no semantic alignment where S can be marked like P: Teiwa is completely 
accusative. 

As in Proto-AP, some bi-valent verbs in Teiwa take a P-prefix, while other such verbs 
express P with a free form. Teiwa bi-valent verbs typically use a prefix to index an 
animate P, while a free form (pronoun or NP) expresses an inanimate P. This is illustrated in 
(2). In (2a),¹² the object of mai is ha-gas qai ‘your younger sister’, an animate referent that 
is indexed on the verb. In (2b) the verbs mai ‘keep’ and usan ‘lift’ share a single object 
aga’ ‘all [of it]’, which is not indexed on the verb because the referent is inanimate. 


¹²Compare Xa’a ma na-mai ‘this come 1SG-keep.for’ ‘Keep this for me’ [constructed example]. 
Table 4: Teiwa pronouns (S, A, P) and prefixes (P and possessor) (Klamer 2010a: 
77-78) 

          A, S long     A, S short   P free         P prefix      Possessor 
          pronoun       pronoun      pronoun                      prefixᵃ 
1SG       na’an         na           na’an          n(a)-         n(a)- 
2SG       ha’an         ha           ha’an          h(a)-         h(a)- 
3SG       a’an          a            ga’an          g(a)-, gə-    g(a)-, a- 
DISTR     ta’an         ta           ta’an          t(a)-         t(a)- 
1PL.INC   ni’in         ni           ni’in          n(i)-         n(i)- 
1PL.EXC   pi’in         pi           pi’in          p(i)-         p(i)- 
2PL       yi’in         yi           yi’in          y(i)-         y(i)- 
3PL       iman, i’in    ia           iman, gi’in    g(i)-, ga-    g(i)-, a-, ga- 

ᵃPossessors can also be expressed with short and long forms of free pronouns, see Klamer (2010a: 79). Teiwa 
possessive prefixes contain the theme vowel /a/ just like the prefixes that index P. Alienable and inalienable 
possession are distinguished by the optional versus obligatory use of the possessive prefix: na-yaf 
‘1SG.POSS-house’ ‘my house(s)’ vs. yaf ‘a house, house(s)’; na-tan ‘1SG.POSS-hand’ ‘my hand(s)’ vs. *-tan (intended 
reading ‘a hand, hand(s)’). 


(2) Teiwa (Klamer fieldnotes TAS:0055; TAS2012:001) 

    a. Xa’a ma   ha-gas qai              ga-mai. 
       this come 2SG.POSS-younger.sister 3SG-keep.for 
       ‘Keep this for your younger sister.’ 

    b. Aga’ usan kamar     gom    ma   mai. 
       all  lift room(IND) inside come keep 
       ‘Pick up all (of it) and keep (it) inside the room.’ 


Another example of an animate P that is indexed on the verb is given in (3a). It con- 
trasts with the P in (3b), which is inanimate and not indexed. A similar contrast is shown 
in (4), but here the free form is a pronoun rather than a lexical constituent. 


(3) Teiwa (Klamer fieldnotes TAS2011:138; TPV2011 2:016) 

    a. Bif   g-oqai         sen   ma   ga-mian. 
       child 3SG.POSS-child money come 3SG-put.at 
       ‘His child gave him money.’ 

    b. In    qap ii  kalax  gom    mian. 
       thing cut red basket inside put.at 
       ‘A red cloth is put inside a basket.’ 
(4) Teiwa (Klamer 2010a: 91) 

    a. Na  ga-mar. 
       1SG 3SG-take 
       ‘I follow him/her.’ 

    b. Na  ga’an mar. 
       1SG 3SG   take 
       ‘I take it.’ 


Some additional illustrations of Teiwa verbs that show DOM based on animacy are 
given in (5). These verbs are attested with both an animate and inanimate object in the 
Teiwa corpus. 


(5) Illustrations of Teiwa transitive verbs showing DOM 

    With P-prefix                      Without P-prefix 
    ga-mar ‘follow someone’            mar ‘take (something)’ 
    ga-sii ‘bite someone’              sii ‘bite (into) (something)’ 
    ga-dee ‘burn someone’              dee ‘burn (something)’ 
    ga-sar ‘notice, find someone’      sar ‘notice, find (something)’ 


However, DOM in Teiwa is not completely predictable and regular, as there are also 
some verbs that index Ps which are not animate. First, the Teiwa corpus contains some 
examples of verbs whose prefix optionally indexes an animate or an inanimate referent. 
An example is uyan ‘search for’ in (6). Both (6a) and (6b) are grammatical, but in (6b) the 
indexed P has an inanimate referent (wat ‘coconut(s)’). In examples (6c)-(6e) the prefix 
on other verbs from the same class indexes inanimate referents: a tree, a coconut, and a 
spoon. 


(6) Teiwa (Klamer, fieldnotes TAS:0628, TC:025a, TTR2010:024; Klamer 2010a: 307) 

    a. Na  n-oqai         ga-uyan. 
       1SG 1SG.POSS-child 3SG-search 
       ‘I’m looking for my child.’ 

    b. Na  wat     ga-uyan. 
       1SG coconut 3SG-search 
       ‘I’m looking for coconut(s).’ 

    c. Burilak   ga’an ma   Sibari   heer nuk ga-sar. 
       clan.name 3SG   come k.o.tree stem one 3SG-notice 
       ‘The Burilaks noticed a Sibari tree.’ 

    d. ...uy     quaf        eran ta  om     qalixil ta¹³ a-fat    mat 
          person grandmother that TOP inside angry   TOP  3SG-foot take 
       ma,  wat     u    ga-tane’ si... 
       come coconut DIST 3SG-kick SIM 
       ‘...that grandmother was angry and with (lit. taking) her foot kicked that 
       coconut, then...’ 

    e. Sii   ga’an in    qap ga-tiri   ba  ga-wa’        la  a’an 
       spoon DEM   thing cut 3SG-float SEQ 3SG.POSS-leaf FOC 3SG 
       dagar. 
       be.visible 
       ‘That spoon is covered by a cloth so that [only] its round part is visible.’¹⁴ 


¹³Ta marks switched topics, but here it functions as a clause-linking device. Its interclausal function may be 
characterized as marking the discontinuity or asymmetry of events in discourse (Klamer 2010a: Sec. 11.4). 


Second, there is a set of verbs that take alternating prefixes to index animates and 
inanimates: the ‘normal’ prefix ga- encodes inanimate Ps, while an ‘augmented’ prefix ga’- 
(pronounced as [gaʔ]) encodes animate Ps. Illustrations are given in (7). Distinguishing 
animate and inanimate objects by the choice of prefix seems to be a minority 
pattern in Teiwa, attested at least for the verbs listed in (8). 


(7) Teiwa (Klamer 2010a: 92) 

    a. Na  gi ga’-tad. 
       1SG go 3ANIM-strike 
       ‘I go hit him/her.’ 

    b. Na  gi ga-tad. 
       1SG go 3SG-strike 
       ‘I go hit it.’ 

(8) Teiwa transitive verbs with alternating prefixes (Klamer 2010a: 91-92) 

    With ga’-prefix                                With ga-prefix 
    ga’-wulul ‘talk with s.o., tell s.o.’          ga-wulul ‘talk about sth., tell sth.’ 
    ga’-wultag ‘talk to/about s.o., tell s.o.’     ga-wultag ‘talk about sth.’ 
    ga’-tewar ‘go/walk together with s.o.’         ga-tewar ‘his (manner of) walking’ 
    ga’-tad ‘hit, strike, touch s.o.’              ga-tad ‘hit, strike at sth.’ 


Note that definiteness does not play a role in the distinction, as both definite and 
indefinite Ps can be indexed. An example of a definite P that is indexed is wat u ‘that coconut’ 
in (6d), while wat ‘coconut(s)’ in (6b) is an indexed indefinite P. 

The distinction between free and bound pronouns (person prefixes) is not uniquely 
reserved for marking the animacy of a referent but is also used to encode contrastive or 
identificational focus in Teiwa.¹⁵ This is illustrated in (9), where the animate P is indexed 
on the verb with a prefix in (9a), but is expressed as a free form in (9b), where it encodes 
a focused constituent. 


¹⁴A more literal translation of this sentence is ‘That spoon, a cut thing floats on [it] so that only its leaf is 
visible’. 

¹⁵New information focus (Lambrecht 1994; Dalrymple & Nikolaeva 2011: 47-48) is marked in Teiwa with a 
dedicated focus particle la and is not further discussed here, see Klamer (2010a: Ch. 11). 
(9) Teiwa (Klamer 2010a: 407) 

    a. Miaag     yivar ga-sii. 
       yesterday dog   3SG-bite 
       ‘Yesterday a dog bit him.’ 

    b. Miaag     yivar ga’an sii. 
       yesterday dog   3SG   bite 
       ‘Yesterday a dog bit HIM (not me).’ 


In sum, the Proto-AP split marking of P plus its semantic alignment system developed 
into an accusative system with DOM in Teiwa. The distribution of the person prefix 
paradigms is lexicalized (normal vs. ‘augmented’). The person prefix that was used for 
human/animate Ps (and some S) in Proto-AP is used in Teiwa to index mostly animate 
Ps. A small class of verbs lexicalized the prefix, and indexes both animate and inanimate 
Ps. The original free pronouns that were used to express A (and some S) in Proto-AP 
function in modern Teiwa to express both A and S (in an accusative system), and also as 
a marker of contrastive focus of P. 


Proto-AP:   A, S *ga(N)     P, S *ga-                     POSS *ge- 
Teiwa:      Focus: ga’an    Animate: ga-, ga’-            POSS: ga- 
                            Inanimate: NP, ga- 
[Diagram: lines link the Proto-AP forms (top row) to the Teiwa forms (bottom row).] 


Figure 6: The historical relation between forms encoding P in Proto-Alor- 
Pantar and in Teiwa 


5 Differential object marking in Abui 


Reflexes of the Proto-AP pronouns in Table 1 are attested in Abui, both in free and bound 
forms, as shown in Table 5. Taking the theme vowel /a/, the first (PAT) paradigm reflects 
the Proto-AP prefixes that encoded Ps in the proto-language. The additional paradigms, 
distinguished by vowel grading and vowel lengthening, elaborated the proto-system.¹⁶ 


¹⁶Most languages of the Alor branch have expanded their verbal prefix paradigms in a similar way to Abui, 
with a prefix containing an /o/ and/or an /e/. Sawila has two verbal prefix paradigms (Kratochvíl 2014b), 
Adang, Klon, and Wersing have three paradigms (Haan 2001; Baird 2008; Robinson & Haan 2014; Schapper 
et al. 2017), and Kamang has seven paradigms (Schapper 2014). This suggests that Proto-Alor may already 
have had two verbal prefixes. 
Table 5: Abui pronominals (Kratochvíl 2007: 78, 2011: 591, 2014a: 555) 

          free      I (PAT)   II (LOC)   III (REC)   IV (BEN)   V (GOAL) 
          pronoun 
1SG       na        na-       ne-        no-         nee-       noo- 
2SG       a         a-        e-         o-          ee-        oo- 
3         -         ha-       he-        ho-         hee-       hoo- 
DISTR     -         ta-       te-        to-         tee-       too- 
1PL.EXC   ni        ni-       ni-        nu-         nii-       nuu- 
1PL.INC   pi        pi-       pi-        pu-         pii-       puu- 
2PL       ri        ri-       ri-        ru-         rii-       ruu- 

Each of the five prefix paradigms may be used to index Ps, and a vague connection may 
be seen between a particular paradigm and the semantic role of the P it encodes, as 
indicated by the semantic role given in brackets in the column header. 
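
The regularity of the Abui paradigms in Table 5 (a person consonant combined with a paradigm-specific theme vowel, which is lengthened in paradigms IV and V) can be sketched as a small lookup. The Python below is an illustrative sketch only: the function name `abui_prefix` and the data layout are assumptions made for this example, and the forms it generates are exactly those in Table 5. It says nothing about which paradigm a given verb actually selects.

```python
# Illustrative sketch (not from the source): Abui person prefixes from Table 5,
# modelled as "person consonant + theme vowel". Names are hypothetical.

CONSONANT = {
    "1SG": "n",
    "2SG": "",      # second person singular prefixes are vowel-initial
    "3": "h",
    "DISTR": "t",
    "1PL.EXC": "n",
    "1PL.INC": "p",
    "2PL": "r",
}

PLURAL_PERSONS = {"1PL.EXC", "1PL.INC", "2PL"}

# paradigm label -> (vowel used with singular persons, vowel used with plural persons)
THEME_VOWEL = {
    "PAT": ("a", "i"),
    "LOC": ("e", "i"),
    "REC": ("o", "u"),
    "BEN": ("ee", "ii"),    # lengthened LOC vowel
    "GOAL": ("oo", "uu"),   # lengthened REC vowel
}


def abui_prefix(person: str, paradigm: str) -> str:
    """Return the Abui prefix for a person category in a given paradigm."""
    singular_vowel, plural_vowel = THEME_VOWEL[paradigm]
    vowel = plural_vowel if person in PLURAL_PERSONS else singular_vowel
    return f"{CONSONANT[person]}{vowel}-"


if __name__ == "__main__":
    for paradigm in THEME_VOWEL:
        print(paradigm, [abui_prefix(p, paradigm) for p in CONSONANT])
```

The table also shows that the PAT and LOC paradigms coincide in the plural (ni-, pi-, ri-), which the sketch captures by giving both paradigms the plural vowel /i/.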

The second (LOC) paradigm has the theme vowel /e/, and is a reflex of the 
Proto-Alor-Pantar possessive prefix *ge- ‘3GEN’. It has often been noted that location and possession 
are semantically related notions: an item is typically located at or near the person that 
possesses it. Abui has drawn on this relation to recruit the possessor prefix of Proto-AP 
as a locative person index.¹⁷ Paradigm four (BEN) elaborates on the locative paradigm by 
lengthening the theme vowel /e/. Vowel lengthening is a strategy to create new forms 
in Abui, and is also used to create a separate set of goal prefixes on the basis of the 
recipient paradigm. The recipient (REC) paradigm itself contains the theme vowel /o/. 
While a prefix with this vowel cannot be reconstructed to the level of Proto-AP, it may 
have been present in Proto-Alor, as similar forms are found in other languages of Alor, e.g. 
Adang jo (Haan 2001), Klon go- (Baird 2008), and Kamang wo- (Schapper 2014), where 
they have a locative function. Prefixes with /o/ might have evolved from a word that was 
originally a locative postposition or verb, and was reanalyzed as a verbal prefix. 

Examples (10a)-(10e) illustrate how the different Abui prefixes roughly correspond to 
semantically different Ps. The prefix expresses, respectively: a patient (10a), a location 
(10b), a recipient/benefactive (10c), a benefactive (10d), or a goal (10e). Note also that 
some of the predicates are complex, consisting of two or more verbs forming a single 
phonological word, as in -l=bol ‘give=hit’ in (10b) and -k=yai ‘throw=laugh’ in (10c) (cf. 
Klamer & Kratochvíl 2010). 


¹⁷In Abui possessive constructions, the Proto-AP possessor prefix (with theme vowel /e/) is used to express 
alienable possession, while the Proto-AP P-prefix (with theme vowel /a/) is used for inalienable possession 
(Kratochvíl 2007): ne-fala ‘1SG.POSS.AL-house’ ‘my house’ versus na-min ‘1SG.POSS.INAL-nose’ ‘my nose’. 
Other Alor languages share this innovation. 
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(10) Abui (Kratochvíl 2007: 592) 


a. Na a-ruidi. 
1sG.AGT  2sG.PAT-wake.up.PFv 


‘I woke you up. 


b. Di palootang mi  ne-l=bol. 
SAGT rattan take  1sG.Loc-give=hit 


“He hit me with a rattan (stick). 


c. Fanmalei | no-k-yai. 
Fanmalei 1sc.REC-throw-laugh 


‘Fanmalei laughed at me’ 


d. Ma na ee-bol. 
be.pRox  1sG.AGT  2sG.BEN-hit 


“Let me hit [it] for you. 
e. Simon di noo-dik. 
Simon 3AGT  1SG.GOAL-prick 


“Simon is poking me? 


Although the above examples show rather transparent relations between the prefix 
and the semantic role of the argument it encodes, in most instances where prefixes are 
used in Abui, the relation between form and semantics is either vague, or absent. This is 
because in Abui, P-indexing is also heavily determined by inflectional classes of verbs, 
and inflectional class assignments are mostly idiosyncratic (see below). 

In Abui, the different semantic types of transitive verbs (e.g. verbs of perception, cog- 
nition, speech, or transfer) encode their P in various ways. Here we will not describe all 
the possible patterns, as that would amount to writing another article (see Kratochvíl
2007; 2011; 2014a; Kratochvíl & Delpada 2015b). Rather, we focus here on the differen-
tial marking of the P of so-called ‘typical transitive’ (Comrie 1989: 111; Haspelmath 2011:
545) verbs only. Such verbs convey the most typical transitive activities, such as kill, hit, 
kick, carry, search for, take, and hold, which have a highly agentive A and a highly pati- 
entive P. In Abui, even this restricted class of typical transitive verbs shows significant 
differentiation in the marking of P, as we will discuss now. 

In Abui, as in Teiwa, animacy determines whether or not a prefix is used on the verb. 
This is illustrated in (11a)–(11b): the inanimate P kanai do ‘these pili nut(s)’ is not indexed
on the verb bol ‘to hit’ in (11a), while the human body part ne-toku ‘my leg(s)’ is indexed by a prefix
on bol in (11b). Note that both NPs are definite: possessives and NPs marked with the
demonstrative do are definite in Abui (Kratochvíl & Delpada 2015a).


Unlike English ‘hit’ and many other verbs, Abui bivalent bol can take different prefixes, indicating
arguments with different semantic roles and often somewhat different senses: *PAT-bol, REC-bol ‘hit at s.o.’,
BEN-bol ‘hit for/instead of s.o.’, GOAL-bol ‘dust off s.o.’

(11) Abui (Kratochvíl 2014a: 566)

a. Di    kanai    do    bol  took.
   3AGT  pilinut  PROX  hit  drop
   ‘He was hitting pili nuts (and) dropping [them].’

b. Baloka     ne-toku       he-bol     he-balasi       ba...
   k.o.grass  1SG.POSS-leg  3.LOC-hit  3.LOC-beat.PFV  SIM
   ‘The baloka grass hit my legs slashing them...’


The variation in (11a) and (11b) is an instance of asymmetric morphological alternation
between a nominal P and a P indexed on the verb with an overt morphological exponent
(Witzlack-Makarevich & Seržant 2018 [this volume]). It is parallel to the reconstructed
Proto-AP pattern and to the pattern in Teiwa, illustrated in (3)-(4) above. In addition, 
animacy also determines marking of P in Abui following a symmetric system, where 
both alternatives are morphologically marked. In (12a), the inanimate P of puna ‘hold’ 
is encoded with a Loc prefix, while in (12b), the same verb takes an animate P which 
is indexed with a GOAL prefix. This type of DOM marking in Abui is analogous to the 
symmetrical pattern in Teiwa, illustrated in (7)-(8) above. 


(12) Abui (Abui corpus: E15BD071, E15BD072)

a. Maama,  na       mahiting  he-puna          yo!
   father  1SG.AGT  meat      3.LOC-hold.IPFV  MD.AD
   ‘Father, I will hold the meat (while you slice it)!’

b. Di     noo-puna!
   3.AGT  1SG.GOAL-hold.IPFV
   ‘He is grabbing (groping) me!’


In addition, Abui P-marking is also sensitive to the semantically more narrow distinc- 
tion between human and non-human referents. When the referent of P is human, the 
main transitive verb combines with another (generic) verb in a complex predicate where 
the P-prefix attaches to the generic verb, as illustrated in (13). The semantic contribution 
of the generic verb ‘give’ in (13) is to flag the presence of a human P. In (13) we illustrate
two such serial constructions: -l=bol ‘give hit’ and -l=balasa ‘give beat’. In both cases, the
referent is human, so the prefix must attach to -l ‘give’. When a referent is not human, the
prefix is not expressed in such a serial construction with -l but is rather attached directly
to the main verb, as was illustrated in (11a) and (11b) above. Kratochvíl (2014a: 567-569)
provides further examples of this asymmetrical DOM pattern, which is sensitive to the
distinction [+/- human]. This pattern is quite frequent in Abui; it is typical for verbs of
change (impingement, locomotion, search verbs) and is spreading into emotion and
cognition verbs.

(13) Abui (Abui corpus: N12.070)
Markus  di   ne-l=bol          ne-l=balasa.
M.      3SG  1SG.LOC-give=hit  1SG.LOC-give=beat.IPFV
‘Markus gives me a beating (lit. hits me (and) beats me).’

Furthermore, besides animacy and humanness, the affectedness of P also plays a role 
in the choice of prefix. This DOM type is the topic of Kratochvíl & Delpada (2015b). Abui 
systematically encodes the degree of affectedness for predicates that describe change (ob- 
servable change, (loco)motion, physical impingement, and going out of or coming into 
existence). In terms of Beavers's (2011) account of affectedness, the Abui PAT-indexed
verbs indicate a maximum degree of affectedness while the LOC-indexed verbs shift one
degree lower (Kratochvíl & Delpada 2015b: 232). The alternation of the degree of
affectedness can be tested with entailments, as shown in (14a)–(14b). The PAT-indexed verb
entails a maximal change to the effect described by the verb and this change cannot be
negated by the entailment (14a), but this is possible with LOC-indexed verbs, as shown
in (14b).


(14) Abui (Abui corpus: E15BD51, E15BD52)

a. di     kawen    ha-komangdii          *haba  de-i-bula
   3.AGT  machete  3.PAT-make.blunt.PFV  but    3I.LOC-have-be.sharp
   ‘He made the knife blunt, #but it's still sharp.’

b. di     kawen    he-komangdii          haba  de-i-bula
   3.AGT  machete  3.LOC-make.blunt.PFV  but   3I.LOC-have-be.sharp
   ‘He made the knife blunter, but it's still sharp.’


A number of verbs of change participate in this DOM pattern in Abui, with some
examples given in (15)-(18). The entailments work in the same way as for the verb -komangdii
‘make blunt’ above. It should be noted that the Abui senses may sometimes map onto
different verbs in English, underlining the semantic distinctions invoked by this DOM 
pattern. 


(15) Observable Change verbs (Kratochvíl & Delpada 2015b: 222)

a. +Affected: PAT ha-
   ha-lilri ‘boil it’
   ha-siki ‘separate it’
   ha-kol ‘tie it up’
   ha-kuya ‘expose it’

b. −Affected: LOC he-
   he-lilri ‘warm it up’
   he-siki ‘split it’
   he-kol ‘tie it’
   he-kuya ‘peel it’


Other AP languages have been described as having DOM systems where ‘affectedness’ is one of the trigger
features: Blagar (Steinhauer 2014: 188-189); Kamang (Fedden et al. 2014: 64-66); Klon (Baird 2008); Sawila
(Kratochvíl 2014b); Kula (Williams 2016).

(16) Move and Stay at Some Location verbs (Kratochvíl & Delpada 2015b: 227)

a. +Affected: PAT ha-
   ha-taang ‘give it away’
   ha-fil ‘pull it’
   ha-bel ‘pull it out’
   ha-baang ‘put on (its lid)’
   ha-kil ‘turn it upside down’

b. −Affected: LOC he-
   he-taang ‘pass it along’
   he-fil ‘pull on it’
   he-bel ‘pluck it’
   he-baang ‘put on shoulder’
   he-kil ‘put it out’


(17) Physical impingement verbs (Kratochvíl & Delpada 2015b: 227)

a. +Affected: PAT ha-
   ha-dik ‘pierce it, stab through it’
   ha-ril ‘ram it in’
   ha-taakda ‘stab to death’
   ha-keila ‘plug it’
   h-afuui ‘scoop it up’
   h-ahii ‘remove it’
   ha-fuuidi ‘flatten it’

b. −Affected: LOC he-
   he-dik ‘stab at it’
   he-ril ‘plant it in’
   he-taakda ‘skewer it’
   he-keila ‘block it’
   he-afui ‘scoop it’
   he-ahii ‘select it, pick it’
   he-fuuidi ‘made it flatter’


(18) Go Out of Existence verbs (Kratochvíl & Delpada 2015b: 228)

a. +Affected: PAT ha-
   ha-lak ‘destroy it’
   h-akung ‘extinguish it’

b. −Affected: LOC he-
   he-lak ‘demolish it’
   he-akung ‘shade it’


And finally, P-indexing is also restricted by Abui verbal inflectional classes, which in 
some cases stipulate the P-index type as PAT, irrespective of the semantics of the event 
expressed by the verb, as described in Fedden et al. (2013; 2014); Fedden & Brown (2017). 
In these studies the prefixing behaviour of Abui verbs was examined. About 10% of the
verbs always index the P with the PAT prefix and do not allow any symmetrical DOM.
This particular inflectional class includes not only typical transitive verbs describing events
of observable change (19), (loco)motion (20), physical impingement (21), and going out
of or coming into existence (22) (e.g., -balak ‘to hit, punch s.o./sth’ and -basa ‘to brush
off sth’), but also verbs of speech, cognition and transfer, as well as verbs of perception,
posture, placement and sound (Fedden et al. 2014; Kratochvíl & Delpada 2015b). It is
possible that these verbs represent an older layer of the Abui lexicon, and reflect an
older stage of its grammar, before the systematic DOM alternation between PAT- and
LOC-indexed verbs was fully grammaticalized.


(19) Observable Change verbs (Kratochvíl & Delpada 2015b: 222)
     ha-basa ‘brush him off, dust it’    ha-weel ‘wash him, bathe him’
     ha-kuol ‘shave it’                  h-iel ‘roast it’
     ha-tamadia ‘repair it’

(20) Move and Stay at Some Location verbs (Kratochvíl & Delpada 2015b: 223)
     ha-fik ‘pull it, pull him’          ha-kuoila ‘topple it’
     ha-ai ‘add it’                      ha-bi ‘lean against it’
     ha-suonra ‘push it’                 ha-kai ‘drop it, trip him’
     ha-reng ‘turn to it’

(21) Physical impingement verbs (Kratochvíl & Delpada 2015b: 224)
     ha-balak ‘punch him’                h-uol ‘hit/strike him’
     ha-laanga ‘grope him’               ha-paakda ‘slap him’
     ha-taak ‘shoot him’

(22) Go Out of Existence verbs (Kratochvíl & Delpada 2015b: 224)
     ha-al ‘burn it’                     ha-pok ‘cover it’
     ha-fuul ‘swallow it’                ha-yol ‘bury it’


The inflectional verb class illustrated in (19)-(22) contrasts with the PAT-LOC alternat- 
ing verbs in (15)-(18) in that the degree of affectedness of their P is not fixed. This can be 
seen when the entailment is a ‘failed’ reading, as shown in (23a)-(23c), something not
possible for the PAT-indexed verbs that participate in the symmetrical DOM discussed
above. For more details, see Kratochvíl & Delpada (2015b).


(23) Abui (Abui corpus: E15BD34, E15BD35, E15BD36)

a. na       ha-fik-i        haba  burook  naha
   1SG.AGT  3.PAT-pull-PFV  but   move    not
   ‘I pulled it but it didn't move.’

b. na       ha-fik-i        haba  sik   naha.
   1SG.AGT  3.PAT-pull-PFV  but   snap  not
   ‘I pulled it but it didn't snap.’

c. na       ha-fik-i        haba  dara   de-yal       mia.
   1SG.AGT  3.PAT-pull-PFV  but   still  3I.AL-place  be.in
   ‘I pulled it but it is in its place (it's too heavy).’

Clearly, this class does not show any evidence of symmetrical DOM as it marks P
always in the same way (with a PAT prefix). Yet it is important to mention it in the context 
of the current paper, because it shows that while Abui differentiates Ps in symmetric 
and asymmetric ways, along a number of different semantic dimensions, the language 
also has a reasonably large class of bivalent verbs that do not take part in symmetrical 
differential marking of P at all. 

The DOM pattern of alternation between LOC- and PAT-indexed verbs is attested with
22% of the verbs in the sample investigated by Fedden et al. (2013; 2014). Furthermore, verbs in this
class can also combine with other series (BEN, REC, or GOAL), i.e. alternate symmetrically.
At the same time, verbs in this class can also occur without a prefix and alternate
asymmetrically in complex predicates (see Fedden et al. 2014: Table 5). In general, the 
three additional series (BEN, REC, and GOAL) are less restricted and combine on average 
with about 87% of the roots. This is expected, given their later development and greater 
productivity. 

In sum, there is a variety of factors involved in the marking of the objects of the typical 
transitive verbs in Abui. These include: 


• the semantic role of P (where Ps that are semantically patient, locative, benefactive
or goal can be marked differently);

• the inherent semantic properties of the argument (whether P is animate or not,
whether P is human or not);

• the relation between the verb and its argument (whether P is affected or not and
to what degree);

• the inflectional verb class (which determines whether or not P is marked differentially,
and how it is marked differentially, i.e. using a symmetrical or asymmetrical
pattern).


The Abui data clearly show that in a single language, DOM can have multiple triggers, 
involving inherent lexical argument properties, inflectional classes, and event semantics; 
and combine symmetrical and asymmetrical morphological alternations. In a language 
family such as the AP family, which tends to index P over S/A, languages may develop 
in a direction where they elaborate on the encodings of P in new ways, as Abui demon- 
strates. Figure 7 shows how the modern Abui morphemes used for DOM relate to the 
reconstructed forms in Proto-AP. 

Unlike in Teiwa, Abui retained the semantic alignment of Proto-AP, where S could 
sometimes be marked as P. In numerous cases, S arguments can be indexed on verbs 
as if they are Ps. In general, such S arguments have a more affected, and less volitional, 
semantics than free-standing S arguments (Kratochvíl 2007; 2011; 2014a; Fedden et al. 
2013; 2014; Fedden & Brown 2017).

[Figure 7, diagram not reproduced. Recoverable labels: Proto-AP *ga(N) (A, S), *ga- (P, S), *ge- (POSS); Abui P, S prefixes ha- (+Affected), he- (−Affected), hee-, ho-, hoo-; Abui POSS prefixes he- (AL), ha- (INAL).]

Figure 7: The historical relation between forms encoding P in Proto-Alor- 
Pantar and Abui. 


6 Conclusions 


Sharing a common ancestor that had DOM, Teiwa and Abui still mark objects differ- 
entially, and in both languages, reflexes of the same proto-morpheme are used in the 
differential marking of P. Yet, there are many differences between the two languages in 
the proto-forms that have been retained and innovated, and in the way DOM is applied. 

In their morphological expression, there are two dimensions in which Ps are differen- 
tiated in both Teiwa and Abui. The first is asymmetrical: either P is expressed as a verbal 
prefix (with an optional co-referent pronoun or NP in the clause), or P is expressed as a 
free pronoun or nominal phrase. Second, Ps may be differentiated symmetrically, by the 
variable choice of a P-prefix depending on the semantics of P. Both strategies are used 
in both languages, but the symmetrical strategy involves two prefixes in Teiwa and five 
prefixes in Abui. The DOM patterns are summarized below (the information structure 
uses are not included). 

Also, the factors triggering DOM are different: in Teiwa it is mostly based on the in- 
herent properties (animacy) of P, while in Abui there are many other triggers besides 
the animacy of P, including the affectedness relation between the action and the P ref- 
erent and the inflectional class of the verb. Furthermore, Abui has developed an extra, 
third, formal strategy to differentiate human Ps from non-human ones in a serial verb 
construction. 

The reconstructed alignment system of Proto-AP was semantic. In Teiwa, this system 
has evolved into an accusative alignment system, but the original system was retained 
and further complexified in Abui. This indicates that alignment systems are not static
and can be modified and complexified over time by putting morphemes of an ancestor
language into new uses and creating new forms, e.g. by adding symmetrical paradigms 
of person-indexing prefixes. 

An interesting comparison can be made with the semantic alignment systems of the 
Papuan languages of North Halmahera discussed in Holton (2008). While there is evi- 
dence for syntactic alignment in proto-North Halmaheran, many of the modern North- 
Halmaheran languages have innovated semantic alignment (Holton 2008: 274-275). In 
the AP languages, the situation is the opposite: the semantic alignment is reconstructed 
for the proto-language, and the syntactic alignment in Teiwa is an innovation. The
historical evolution of alignment systems is therefore not unidirectional (from
syntactic to semantic alignment): evolution in both directions is possible, and is
facilitated by DOM (in Alor Pantar) and optional or pleonastic marking (in North
Halmahera).

It seems that languages that have semantic alignment (or differential S marking) along- 
side DOM, such as Abui, tend to develop more complex systems of DOM than languages 
with accusative alignment, such as Teiwa. 

In the development of their respective DOM systems, Teiwa and Abui underwent dif- 
ferent morphological changes. The Proto-AP prefix *ga- is reflected in the Teiwa prefix 
that encodes topics and animate Ps, as well as in Teiwa possessors. In Abui, this prefix 
is reflected as the PAT prefix and could be the source of the innovated prefixes as well 
(Klamer & Kratochvíl 2012). The PAT prefix is the most semantically bleached prefix of all
five of the Abui P-prefixes, as it is obligatory for a semantically diverse class of verbs that 
makes up 10% of the total number of verbs investigated in Abui. Most of these verbs
encode events describing various types of change (observable change, (loco)motion,
physical impingement, going out of and coming into existence), suggesting a relationship
with affectedness. The Abui LOC and BEN prefixes feature the theme vowel /e/, reflecting
the Proto-Alor Pantar possessive prefix *ge- ‘3GEN’, but in Teiwa no reflex of this prefix
has been retained. 

The proto-pronoun *ga(N) that was used to encode A and S in Proto-AP is reflected
in modern Teiwa as the free pronoun ga’an, but in Teiwa it encodes contrastively focused
Ps. In Abui it encodes A, but the final nasal has been lost. Abui has also innovated a new
prefix paradigm with a theme vowel /o/, and two additional paradigms by lengthening
the vowel of existing paradigms. Apart from the use of reflexes of the Proto-AP object
prefix *ga-, very few similarities remain between the morphemes that are used in Teiwa
and Abui DOM. 

In sum, this study has shown that the evolutionary path of DOM from Proto-Alor Pan- 
tar into its daughter languages has both stable and unstable features. Stable features are 
the inherent semantic feature of humanness/animacy of P that is being coded, and the 
shape of the person prefix that is used in the coding. However, the semantic alignment 
system of Proto-Alor Pantar appears to be volatile, as it changed to accusative in Teiwa. 
This is not an unexpected result since alignment patterns are sensitive to morphological 
and phonological changes. Also, a language can develop additional triggers for DOM as 
well as the additional person markers that it needs to encode these additional types of
Ps alongside inflectional verb classes, as has happened in Abui. In general, the DOM
triggers in Abui shifted away from being purely participant-related, to include event-related
features (degree of affectedness) as well. The Alor Pantar languages show that alignment 
systems are not static: their forms and triggers may be modified and complexified quite 
substantially over time.

Abbreviations

1      first person
2      second person
3      third person
AD     addressee-perspective
AGT    agentive pronoun
ANIM   animate
AP     Alor Pantar
APPL   applicative
ASSOC  associative
BEN    undergoer prefix paradigm (benefactive-like)
CONT   continuative
DEM    demonstrative
DIST   distal
DISTR  distributive
DOM    differential object marking
EVID   evidential
FOC    focus
GOAL   undergoer prefix paradigm (goal-like)
INC    inchoative
IND    Indonesian
LNK    linker
LOC    undergoer prefix paradigm (location-like)
MD     medial
MOD    modal
PAT    undergoer prefix paradigm (patient-like)
PRIOR  priorative
PROX   proximate
REC    undergoer prefix paradigm (recipient-like)
SEQ    sequential
SIM    simultaneous
SPC    specific determiner
TOP    topic

The Spanish language is known for its widespread phenomenon of Differential Object
Marking (DOM). A particularly interesting feature of DOM in contemporary Spanish relates to
the obligatory use of a double system of marking – a “flagging” preposition and an “indexing”
clitic (Haspelmath 2005) – in the domain of the full personal pronouns. The prepositional
marker goes back to the very beginnings of the language, whereas the cross-referencing
strategy, also called clitic doubling, is the product of a much later development, which,
joining forces with the existing older form, gave rise to the twice-marked pronouns. In this paper
I focus on the origin of Spanish indexing DOM and, through a careful examination of the
first contexts of use, I propose that the relevant notion of “topicality” implicated in the
evolution of indexing DOM is not animacy, but has to do with the role participants play in the
event structure and the organization of these roles into a topical case hierarchy (Givón 1976).


1 Introduction 


It is known that Spanish has a robust system of Differential Object Marking (Bossong 
1991; 1998). A particularly interesting feature of Spanish DOM is that some direct objects 
require double marking. This phenomenon characterizes the stressed object personal
pronouns, which in present-day Spanish impose the use of the preposition a along with the
presence of an unstressed person form – a verbal clitic – showing the relevant agreement
features with the pronominal object phrase: 


(1) Porque   tú       me   am-as         a    mí  ¿no  es  cierto?
    because  you.NOM  ACC  love-PRS.2SG  ACC  I   not  is  certain
    ‘Because you love me, right?’ (1999, Jorge Volpi, En busca de Klingsor, CREA)


The preposition a represents the older and more basic instrument of Spanish DOM, 
traceable to the earliest texts. According to the hypothesis outlined in Pensado (1995b), 
and now commonly accepted (Torrego Salcedo 1999; Leonetti 2004; Iemmolo 2010), the
development of a into a DOM marker has its roots in contexts where Latin ad, meaning
‘with regard to, as to’, indicated a shift of topic.¹ The topicalizing function of ad was
passed on to various Romance languages via Vulgar Latin – initially confined to the personal
pronouns of first and second person in a dative or accusative role – and from there evolved
towards the grammaticalized use of a as a differential object marker in Spanish (Pensado
1995b).²

More specifically, the history of Spanish a shows the evolution of a DOM marker ex- 
tending gradually downwards along the animacy hierarchy, in close interaction with a 
parameter of definiteness (García & van Putte 1995; Melis 1995; Aissen 2003; von Heu- 
singer & Kaiser 2005; Laca 2006). The evolutionary path emerges from comparing the 
situation reflected by the earliest available text (Cantar de mio Cid, dating most probably 
from around the turn of the 13th century; cf. Montaner 1993: 8) with that of contemporary
Spanish. At the beginning, one observes a compulsory use of a with both the stressed 
personal pronouns and the human-referring proper names, as opposed to the incipient 
and optional marking of the common nouns indicating definite (sets of) individuals. In 
today's Spanish, on the other hand, following the progressive descent of the preposition 
from definite to non-specific indefinite persons, a introduces nearly all human objects, 
while the inanimate objects are usually left unmarked. To illustrate the prevailing situa- 
tion in contemporary Spanish, Torrego Salcedo (1999: 1781-1782) offers this contrast: 


(2) a. Traj-eron      a    un  amigo   con   ellos.
       bring-PFV.3PL  ACC  a   friend  with  them
       ‘They brought a friend with them.’

    b. Traj-eron      una  maleta    con   ellos.
       bring-PFV.3PL  a    suitcase  with  them
       ‘They brought a suitcase with them.’


Spanish a thus profiles a well-attested path of evolution for DOM markers. Descending 
from an as-for topic expression of late Latin, the preposition is originally used to signal 
the promotion of a salient pronominal object referent to the status of clausal topic in a 
pragmatically marked construction. With the passage of time, however, as suggested by 
the earliest records of the Spanish language, the preposition evolves into a differential 
marking device extended to direct objects which are no longer topics, but which conserve 


¹ The topicalizing function of Latin ad manifests itself in examples where the topic shifter forms part of a
larger phrase (quod ad me attinet, X ‘as far as I am concerned, X’, quod ad Xenonem, X ‘as for Xenon, X’),
and in contexts where it is used alone (ad ea autem, quae scribis de testamento, X ‘with regard to what you
write about the will, X’) (Pensado 1995b).

² Pensado reconstructs the historical path of Spanish a on the basis of a careful examination of Vulgar Latin
and early Romance data, which support her hypothesis that the origin of a as a topic marker goes back to
a construction of Vulgar Latin restricted to the personal pronouns of first and second person, both dative
and accusative; cf. Ad mihi, (mihi) dixit ‘To me, he told (me)’ or Ad mihi, (me) amat ‘Me, he loves (me)’
(Pensado 1995b: 203). Her proposal to situate the beginnings of Romance DOM in the area of the personal
pronouns, as she notes, ties in with what other authors had pointed out in the past (Meier 1948; Rohlfs
1971).

features of topicworthiness such as animacy and definiteness (Iemmolo 2010).³

The second instrument of Spanish DOM, on the other hand, is the product of a more 
recent development, which the obligatorily a-marked strong personal pronouns begin 
to undergo around the turn of the 16th century, that is to say, during the transition pe- 
riod between medieval and renaissance Spanish (Keniston 1937: 83; Silva-Corvalán 1984; 
Rini 1991; Gabriel & Rinke 2010). As shown in (1) above, the new device consists of a 
coreferential clitic pronoun, morphologically bound (but not attached) to the verb. This 
phenomenon is known as clitic doubling (for the definition of the coreferential forms in
terms of clitics, see §2), and the relation of clitic doubling to DOM has been
acknowledged (Bossong 1998: 221-224). Indeed, the use of the coreferential pronoun in Spanish
separates the higher-ranked pronouns, which display the clitic along with a, from the 
lower-ranked nominal objects, marked with a alone (human) or taking no marking (inan- 
imate). 
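
The three-way split just described can be summarized as a small decision rule. The following Python sketch is only an illustration under simplifying assumptions: the names `Marking` and `object_marking` and the two boolean parameters are hypothetical, and the rule deliberately ignores the further factors (definiteness, specificity, verbal semantics) that complicate the actual system.

```python
from typing import NamedTuple


class Marking(NamedTuple):
    flagging: bool  # preposition "a" is present
    indexing: bool  # coreferential clitic is present


def object_marking(is_stressed_pronoun: bool, is_human: bool) -> Marking:
    """Toy rule for contemporary Spanish direct objects (hypothetical helper)."""
    if is_stressed_pronoun:
        # Stressed personal pronouns: "a" plus the doubling clitic.
        return Marking(flagging=True, indexing=True)
    if is_human:
        # Human nominal objects: "a" alone.
        return Marking(flagging=True, indexing=False)
    # Inanimate nominal objects: no marking.
    return Marking(flagging=False, indexing=False)


print(object_marking(True, True))    # cf. (1): me amas a mí
print(object_marking(False, True))   # cf. (2a): trajeron a un amigo
print(object_marking(False, False))  # cf. (2b): trajeron una maleta
```

The sketch treats the pronoun/noun split and the human/inanimate split as two ordered checks, which mirrors the ranking of pronouns above nominal objects in the description above.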

In Haspelmath's (2005: 2) terminology, Spanish a instantiates the "flagging" type of 
argument marking (= coding by case affixes and adpositions), whereas the clitic corre- 
sponds to the “indexing” type (= cross-referencing or agreement). From this perspective, 
the peculiarity of the stressed object personal pronouns of Spanish resides in conjoining 
two kinds of DOM: flagging DOM (a) and indexing DOM (clitic). Another way of refer- 
ring to the double marking of the Spanish pronouns is proposed by Iemmolo (2014), who 
reserves the label DOM for the flagging type of marking and calls the other mechanism 
DOI (Differential Object Indexation). 

The clitic doubling strategy employed for the purpose of Spanish DOM is the central 
topic of the present paper. Scholars have been interested in the question of why the per- 
sonal pronouns became subjected to the new type of marking, and different hypotheses 
have been put forward (Silva-Corvalán 1984; Rini 1991; Gabriel & Rinke 2010). None, how- 
ever, as we shall see, manages to satisfactorily account for the change associated with 
the turn of the 16th century. This leaves room for a new attempt at explaining how the 
change came about. The way I intend to approach the topic at hand is through a careful 
examination of textual sources in which the new type of marking has the character of an 
incipient phenomenon, affecting a few pronouns and leaving the rest untouched, under 
the assumption that defining what the selected few share in common may shed light on 
the original motivating force behind the change. 

As background to the analysis I will present, two important facts have to be mentioned. 
First, it should be pointed out that the introduction of clitic doubling into the pronominal 


³ I must add that even though the split between human and non-human objects is held to define Spanish
DOM (Leonetti 2004: 82) – the status of the non-human animate objects being unclear (von Heusinger 2008:
4) –, the system is actually much more complex owing to its sensitivity to various factors beyond animacy,
such as the properties of definiteness and/or specificity of the object referents, the aspectual features of 
the predicate, the semantics of the event denoted by the verb, as well as the relation holding between 
the subject and the object. The interaction of animacy with these additional parameters, accounting for the 
appearance of unmarked human objects and a-marked inanimate objects in specific discourse contexts, has 
been examined in a number of studies with important contributions to our understanding of the intricacies 
of Spanish DOM (Kliffer 1984; Pensado 1995a; Torrego Salcedo 1999; Company Company 2002; Delbecque 
2002; Aissen 2003; von Heusinger & Kaiser 2003; Leonetti 2004; García García 2007; 2014; von Heusinger 
2008; among others).

domain did not imply a creation in the strict sense of the word; the coreferential form
had a long history of appearing in topicalizing constructions, known as left- and right- 
dislocations, where it was used to bind a detached object constituent to the core clause. So 
it will be necessary to look at these structures in order to understand how they prepared 
the way for the development of indexing DOM with the pronouns. 

Second, it has to be borne in mind that the stressed object pronouns affected by the 
change have been emphatic forms throughout the history of Spanish. Their selection 
in specific discourse contexts always signals a deliberate intention on the part of the 
speaker to highlight something about the referent of the pronoun. More will be said 
below on the split between stressed and unstressed forms within the Spanish personal 
pronoun system. For the moment, the fitting observation is that the development of 
clitic doubling as a device for DOM cannot be explained without taking into account the 
crucial emphatic value of the targeted pronominal items. 

Anticipating the results of my analysis, I will argue that the emergence of indexing 
DOM in Spanish appears to have involved a notion of topicality, but not one in which 
animacy was the relevant feature, in contrast to a. As suggested by Givón (1976: 152), 
topicality should be visualized as encompassing a number of binary hierarchic relations, 
among which the author includes one that concerns the role of participants in the event 
structure. On the role dimension, entities are ranked according to the degree to which 
their participation contributes to the coming about of the event (more involved partic- 
ipant > less involved participant). This binary relation is assumed to underlie the case 
hierarchy (agent > dative > accusative), in which the more "topical" participants, in ad- 
dition to being typically human and definite, rank above the accusative object from the 
point of view of their higher degree of involvement in the action. My aim is to show that 
the grammaticalization of indexing DOM in Spanish closely interacted with this specific 
dimension of topicality. The primary evidence for this proposal is that Spanish indexing 
DOM will be seen to favor the dative pronouns before generalizing to all personal ob- 
ject pronouns (indirect and direct). Further support comes from the later extension of 
indexing DOM to the indirect (not direct) object noun phrases. 

The interaction between flagging and indexing DOM in Spanish thus offers a complex 
panorama of historical developments, which can be divided into three major stages:


* throughout medieval Spanish, flagging DOM and the indexing device (in topic
constructions) operate independently from one another (see §3);


* in renaissance Spanish, indexing DOM becomes a grammaticalized feature of the 
personal object pronouns, both dative (marked by a homophonous a form) and 
accusative (obligatorily DOM flagged). This is the period in which the two types 
of DOM meet, and their convergence is the focus of the present paper; 


* in modern Spanish, indexing DOM spreads to the dative noun phrases, whereas
the nominal direct objects only show flagging DOM or are left unmarked (Melis
& Flores 2009, and see below §4.1).

The paper is organized as follows. §2 provides a brief overview of the object personal
pronouns of Spanish. In §3 the older use of the coreferential pronoun with dislocated
object phrases is examined. §4 is dedicated to the development of Spanish indexing DOM:
the general properties of the diachronic change are sketched in §4.1; previous approaches
are discussed in §4.2; the hypothesis set forth in this paper is outlined in §4.3; the corpus
of data is described in §4.4; and the analysis of the data is carried out in §4.5. §5 concludes
with a summary of the paper.


2 The Spanish object person forms 


For the purpose of this paper, a brief introduction to the Spanish personal pronoun sys- 
tem will be helpful. Of specific interest are the object pronouns, which show a division 
into stressed and unstressed forms. The former are referred to in terms of “full”, “strong”
or “tonic” pronouns, whereas the latter are called “weak” pronouns or “clitics”. In Table 1, 
a simplified picture of the object paradigm based on Penny (1991: 119) is presented. It is 
important to observe (for the change to be discussed) that across the paradigm, with a 
few exceptions in the third person area, identical forms cover both the accusative and 
dative realizations of the pronouns.4


Table 1: The Spanish object person forms

                 ACCUSATIVE                 DATIVE
                 stressed     unstressed    stressed     unstressed
1 SG             mí           me            mí           me
2 SG             ti           te            ti           te
3 SG  masc.      él           lo            él           le
      fem.       ella         la            ella         le
      neuter     ello         lo            ello         le
1 PL             nos(otros)   nos           nos(otros)   nos
2 PL             vos(otros)   (v)os         vos(otros)   (v)os
3 PL  masc.      ellos        los           ellos        les
      fem.       ellas        las           ellas        les


When a language possesses a pronominal system with a similar division, it is usu- 
ally the case that the unstressed, that is, phonologically attenuated, forms encode highly 
topical and cognitively accessible referents (Siewierska 2004: 174). This tendency is con- 
firmed by Spanish, where the weak object pronouns are, and always have been, the 


4 The segments in parentheses indicate changes that took place in late Old Spanish (end of 15th cent.), namely,
the reduction vos > os and the expansion vos > vosotros, followed at a later stage by the analogical expansion
nos > nosotros. I have excluded the contemporary deferential forms of address usted and ustedes. Nor does
my overview mirror the early phenomenon of leísmo (which continues in standard Peninsular Spanish),
whereby the dative form le is used as a direct object form with masculine referents.

canonical forms used to refer to the participants that are deictically or anaphorically
anchored in the discourse (cf. me vio ‘(s)he saw me’; lo vi ‘I saw him’).

What did change in the course of time is the grammatical status of the weak object pronouns.
These began as phonologically bound forms, which had to “lean” on a preceding
or following word for accentual reasons, but enjoyed a certain degree of independence 
from a syntactic point of view. Over time, however, the weak object pronouns were led 
to transform into elements definable as clitics on the basis of their morphological binding 
to the verb (immediately preceding the finite verb or attached at the end of imperative 
and non-finite verbal forms). As discussed in the literature, the cliticization of the weak 
pronouns - product of a gradual loss of positional and combinatory options on the syn- 
tactic level - was completed by the early 17th century (Rivero 1986; Rini 1990; Fontana 
1993; Fernández Soriano 1999; Nieuwenhuijsen 2006). 

It is worth noting that the period during which the weak pronouns were evolving into 
clitics (15th-16th century) more or less coincides with that of the rise of indexing DOM. 
A relation between the two phenomena has to be established, since the development 
of a type of object-verb agreement in the area of the strong personal pronouns was no 
doubt facilitated by the newly acquired clitic status of the weak person forms (Rini 1990; 
Enrique-Arias 2003). 

The strong object personal pronouns, on the other hand, behave like (prosodically and 
morphologically) independent noun phrases, associated with one peculiar feature: they 
are emphatic. So when a strong pronoun surfaces in discourse, some kind of special effect 
is intended, typically, a contrast: the individual encoded by the pronoun is compared or 
opposed to other referents, whether explicitly or implicitly (Luján 1999). 

To illustrate, consider this pair of late medieval examples: 


(3) a. pues  diz-es      que  me    am-as
       since say-PRS.2SG COMP I.ACC love-PRS.2SG
       ‘since you say that you love me’ (15th c., Bursario, CORDE)

    b. miémbr-a-te           que  por am-ar    a   mí [...]
       remember-IMP-2SG.REFL COMP for love-INF ACC I
       mat-aste     a   tres  hermanos míos
       kill-PFV.2SG ACC three brothers mine
       ‘remember that for the sake of loving me [...] you killed three of my brothers’
       (15th c., Bursario, CORDE)


In (3a) the weak form me corresponds to the way in which a first person functioning 
as direct or indirect object is expected to appear in most discourse contexts. But on 
occasion, as in (3b), the speaker chooses the tonic instead (accusative mí preceded by
obligatory a), the emphatic force of which is in this case called upon to underscore the
explicit contrast established between an act of love and a triple murder. The pronouns
that will undergo clitic doubling are these emphatic forms.


3 Coreferential weak pronouns in left- and right-dislocations


In this section, left- and right-dislocated sentences motivating the occurrence of a coref- 
erential pronoun are examined. They constitute a very old phenomenon that is present in 
the earliest Spanish texts and indeed goes back to Latin, where topicalized constituents 
were often accompanied by a resumptive pronoun (Pensado 1995b: 198). From the point 
of view of this study, the dislocations in question are of particular significance because 
scholars have relied on these pragmatically marked structures to explain the origin of 
Spanish indexing DOM. At the end of the section, the plausibility of tracing the indexing 
device back to the dislocations will be evaluated. 

The most suitable text to examine the older function of the coreferential pronoun is 
the epic poem Cantar de mio Cid, especially rich in examples (Menéndez Pidal 1964: 323). 
These are built with different kinds of object phrases. Their common property lies in the 
peripheral position the object occupies on the left or right end, along with the occurrence 
of a coreferential pronoun in the core clause. For example, in (4), the DOM-flagged direct
object (a las sus fijas ‘his daughters’) has been detached to the left periphery and is
resumed by the weak form las, which reproduces the case, gender and number features
of the detached noun:5


(4) a   las sus fijas     | en braços las          prend-ía
    ACC the his daughters   in arms   they.ACC.FEM take-IPFV.3SG
    ‘his daughters, he embraced them’ (v 275)


The vertical bar in (4) symbolizes the caesura, indicative of an intonation break in 
the recitation (Gabriel & Rinke 2010: 71, with a reference to Fontana 1993: 263), and 
in this sense helpful for the recognition of a dislocated structure. With respect to left 
dislocations, Lambrecht (1994: 183) points out that they are often used “to mark a shift 
in attention from one to another of two or more already activated topic referents”. The 
Cantar de mio Cid illustrates this nicely insofar as many of its left dislocations involve 
central figures of the poem, for instance, the Cid’s daughters, as in (4). 

In (5), a DOM-flagged strong personal pronoun (a vós ‘you’) occupies the right periph- 
ery and is accompanied by the coreferential form vos: 


(5) aquéllas      vos     acomiend-o      | a   vós, abbat don Sancho
    those.ACC.FEM you.DAT entrust-PRS.1SG   DAT you  abbot don Sancho
    ‘I now entrust those [girls] to you, you abbot don Sancho’ (v 256)

Right dislocations, also called afterthought-topics, are less easily recognizable in Spanish,
because the detached object may appear as if it were occupying the canonical postverbal
position (of direct and most types of indirect objects). In this example, however, the


5 The examples of the Cid are cited from Montaner's (1993) edition.

6 Menéndez Pidal (1964: 400) discusses another diagnostic for the identification of pragmatically marked
structures in the Cantar, related to the position of the coreferential pronoun in the structure of the verse.

7 In (5) the plural form vos is used as a deferential form of address.

correct analysis gains support from the presence of the caesura.8 Right dislocations pose
an additional challenge to the extent that their function in discourse continues to be a 
matter of some dispute. Broadly speaking, they are supposed to bear on the identity of 
the referent of the coreferential form in the core clause, adding explicitness for the ben- 
efit of the addressee (Lambrecht 1994: 2003), or, from a wider perspective, providing an 
informational “update” that is meant to replace, correct or partially adjust elements con- 
tained in the core clause (Escandell-Vidal 2009: 856-859, following Vallduví 1992). The 
right dislocations of the Cid would have to be examined in detail in order to verify these 
proposals. 

Of greater interest to us is the fact that the detached objects in (4) and (5) are both 
marked with a. The co-occurrence of a and the coreferential pronoun in the disloca- 
tions of the Cid explains why it has been claimed that the two devices have a long his- 
tory of working jointly in the service of Spanish DOM (Laca 1995; Melis 1995; Leonetti 
2004; 2008). Yet the truth is that flagging a and the resumptive weak form operate 
independently from one another, as the following data show. Indeed, in the poem, a 
marks the stressed personal pronouns and the human-referring proper names obliga- 
torily and is used optionally with human definite nouns, but none of these objects are 
cross-referenced if they appear in a sentence bearing no sign of dislocation: 


(6) Oi-d       a   mí, Álbar Fáñez | e   todos los cavalleros.
    listen-IMP ACC I   Alvar Fañez   and all   the knights
    ‘Alvar Fañez, and all the knights, listen to me’ (v 616)


And inversely, a resumptive pronoun tends to show up in dislocated structures, but 
if the topicalized element does not pertain to the class of direct objects that impose or 
attract flagging DOM, we find a coreferential pronoun without a, as exemplified by the 
non-specific human referents in (7a) and the inanimate entity in (7b): 


(7) a. los moros       e   las moras         | vend-er  non
       the moorish.men and the moorish.women   sell-INF not
       los           pod-remos
       they.ACC.MASC be.able-FUT.1PL
       ‘the moorish people, we won't be able to sell them’ (v 619)

    b. mas el  castiello | non lo     quier-o      herm-ar
       but the castle      not he.ACC want-PRS.1SG destroy-INF
       ‘but the castle, I don't want to destroy it’ (v 534)


8 In modern spoken language, right-detached constituents are characterized by a number of defining
prosodic features (Anagnostopoulou 1999: 765; Escandell-Vidal 2009: 852; Gabriel & Rinke 2010: 64-65).

9 Some decades ago, a strong hypothesis regarding the interaction between clitic doubling and prepositional
DOM received expression in what came to be known as “Kayne’s generalization”, which stated that for an
object noun phrase to be doubled by a clitic it had to be preceded by a preposition (Kayne 1975; Jaeggli
1982). The hypothesis has been refuted on the basis of empirical data - doubling clitics appear alone (Suñer
1988; Anagnostopoulou 1999; Leonetti 2008) - but it continues to raise expectations about potential co-
occurrences of the two marking mechanisms.

Furthermore, cross-referencing pronouns are also found to interact, as in (8), with
clausal object complements, which are never subjected to flagging DOM:


(8) Ya  lo     ve-e        el  Cid | que  del    rey  non
    now it.ACC see-PRS.3SG the Cid   COMP of.the king not
    av-ié         gracia.
    have-IPFV.3SG grace
    ‘Cid now knew it, that he was out of favor with the king’ (v 50)


In light of these data, one is able to conclude that a and the coreferential pronoun in 
the Cid have different functions. Whereas prepositional DOM marks the higher-ranked 
objects with human reference, the pronoun, unrelated to DOM, appears in pragmatically 
marked topic constructions, where its principal function is to bind the dislocated object 
constituent to the core clause (Keniston 1937: 84; Nocentini 2003: 109; Real Academia 
Española 2010: 757). 

It is possible that the widely used dislocations in the Cid should be viewed as lingering 
traces of the oral tradition that is assumed to have given shape to the epic poem. What is 
certain is that the medieval texts posterior to the poem display an extremely scanty use of 
topicalizing constructions, and as a direct consequence of this decline in frequency coref- 
erential pronouns become equally rare. That is to say, a continues to extend downwards 
along the animacy hierarchy, but the objects are not cross-referenced since they are not 
dislocated. The perception that coreferential pronouns with (a-marked or unmarked) ob- 
ject constituents were not common during the post-Cid medieval period of Spanish is 
shared by all scholars who have dealt with this issue, in relation to flagging DOM (Laca 
2006; von Heusinger & Kaiser 2005), or from the angle of indexing DOM (Silva-Corvalán 
1984; Rini 1991; Fontana 1993; Eberenz 2000; Gabriel & Rinke 2010; Vázquez Rozas & Gar- 
cía Salido 2012). This does not mean that topic constructions with coreferential pronouns 
died out. Actually, they continue to be in use today, but they appear as infrequently as 
in the medieval texts (for some quantitative data on fronted objects in contemporary 
Spanish, see García-Miguel 2015: 215-216). 

So the question is whether the use of the coreferential pronoun in the examined con- 
structions paved the way for the rise of indexing DOM, associated with the second his- 
torical period of the Spanish language. It is tempting to motivate a link between the 
older use of the pronoun and the later development, in view of Givón's (1976) hypothe- 
sis about the rise of (subject and) object agreement markers as being due to a reanalysis 


10 Worthy of note is the fact that the human definite objects, which in the Cid are optionally flagged, often
occur in (left) dislocation structures accompanied both by a and the coreferential pronoun. When they are
not topicalized, besides lacking the pronoun, of course, these objects are also more likely to appear without
a (Melis 1995). A plausible explanation for this phenomenon is that the incipient use of a with these objects
still depends, in some measure, on the establishment of the referent of the object as the pragmatic sentence
“topic”, and may therefore be viewed as a vestige of the beginnings of flagging DOM in the Romances (see
§1). The topicalizing phenomenon with the human definite objects in the Cid has been instrumental in
creating the wrong impression that the preposition and the weak pronoun have worked jointly for DOM
throughout the history of Spanish.

of anaphoric pronouns in “over-used” topic-shift constructions, meaning, in construc-
tions where the pragmatic motivation for the marked word order had lost transparency.
Under Givón's (1976: 156-157) proposal, afterthought-topic constructions (right disloca-
tions) are likely to be particularly relevant to the development of object agreement (cf. I
saw him, the man > I saw-him the man).

It will be seen below (§4.2) that Givón's hypothesis underlies some of the proposals
that have been put forward to explain the rise of indexing DOM in Spanish. Nevertheless, 
as argued in Vázquez Rozas & García Salido (2012: 279), the envisaged scenario cannot 
be made to fit the Spanish data with ease, considering that the poorly documented topic 
constructions of medieval times do not evoke the "overuse" established as a condition 
for the reanalysis of the anaphoric pronouns. Additionally, it turns out that the examples 
appearing in the medieval sources (Vázquez Rozas & García Salido 2012: 279, with a 
reference to Riiho 1988) often display a dislocated subordinate clause cross-referenced 
with the neuter pronoun lo, as in (8) above. An object of this nature is not what we think 
of when defining the notion of topicality, nor does it in any way resemble the strong 
personal pronouns that will eventually attract the coindexing strategy. 

In short, the difficulty of tying indexing DOM immediately to the medieval disloca- 
tions is real, and everything seems to point in the direction of an innovative process 
of change, whereby the function of an available form - a coreferential pronoun - was 
expanded to satisfy a different purpose. 


4 The grammaticalization of Spanish indexing DOM 


4.1 Preliminaries 


The rise of Spanish indexing DOM can be traced back to the turn of the 16th century, 
when an increase in the use of a weak pronoun with a (necessarily) DOM-flagged strong 
object personal pronoun becomes noticeable (Keniston 1937: 83; Silva-Corvalán 1984; 
Rini 1990; Gabriel & Rinke 2010). This increase is the signal of a change in process that 
would culminate with the grammaticalization of object agreement in the pronominal 
domain, completed more or less by the end of the 17th century. 

The examples of doubled pronouns in (9) come from the CORDE materials examined 
for the purpose of this study, on which more will be said below (§4.4):


11 The proposed date of completion varies. Some authors associate it with the end of the 16th century, while
others detect non-doubled pronouns until the 18th century (Silva-Corvalán 1984; Rini 1990; Gabriel & Rinke
2010; Vázquez Rozas & García Salido 2012). The discrepancy hinges on the nature of the data. As we will
see below, the textual sources suggest that doubling was used with variable frequency depending on the
individual speakers/writers. What is clear is that clitic doubling grammaticalized a bit more slowly with the
third person pronouns than with the speech act participants (see §4.4 and footnote 17). My own perusal of
data of the CORDE motivates my stating that instances of non-doubled pronouns are extremely rare after
the 17th century.




4 Spanish indexing DOM, topicality, and the case hierarchy 


(9) a. Por cierto, que  a   mí me    pesa           mucho de su
       of  course  that DAT I  I.DAT grieve-PRS.3SG much  of his
       muerte.
       death
       ‘Of course, I very much lament his death’ [lit. ‘it grieves me of his death’]
       (1555, Espejo, CORDE)

    b. Señor, ¿porqué me    d-ais        cargo  a   mí?
       sir    why     I.DAT give-PRS.2PL charge DAT I
       ‘Sir, why do you accuse me?’ (1517, Arderique, CORDE)

    c. El  que a   mí aquí me    trux-o        no  es el  diablo
       the who ACC I  here I.ACC bring-PFV.3SG not is the devil
       que  diz-es [...]
       that say-PRS.2SG
       ‘The one who brought me here is not the devil as you say’ (1504, Esplandian,
       CORDE)


The examples exhibit strong pronouns that are collocated in different positions, with- 
out suggesting the presence of a recognizable dislocation. I found this to be true in the 
majority of cases.12

Certainly, fronted pronouns, as in (9a), are common, but they are selected by verbs 
with special characteristics like pesar ‘to grieve, to lament’, whose experiencer argument 
has always had a tendency to favor the sentence initial position (Melis & Flores 2013). 
During the period under study, the pronominal experiencer of these verbs is frequently 
preverbal including when it is not doubled (see example (18a) below). Other pronouns 
occupy the sentence final position, as in (9b), and may provoke ambiguity (perhaps a 
right dislocation), although nothing in their behavior differentiates them from the non-
doubled tonic phrases which likewise appear at the right-most end (see example (14a)
below). The remaining pronouns occur in the middle of the sentence, as in (9c), and
are impossible to confuse with a detached constituent.

The doubled pronouns also differ as to their status in the information structure of the
sentence. From the cognitive point of view, the entities coded in the form of personal
pronouns are “prominent”, in the sense that the referent of the pronoun, beyond its pre-
supposed condition of familiarity, is also the current center of attention of the speech

12 Some dislocated structures did show up. In the following example, a mí is detached to the left and the core
clause begins with the subject pronoun uno: Por cierto a mí uno solo me perdió, mas yo he perdido a muchos
‘Actually, as far as I am concerned, only one fellow ruined me, whereas I ruined many’ (1520, Ysopo, CORDE).
For a right dislocation, see footnote 13 below. The dislocations were eliminated from my analysis.

13 A right dislocation was detected in a few examples, as in this one: porque Cortés me mostró la misma carta
a mí y a otros conquistadores ‘because Cortés showed me the same letter, to me and to other conquistadors’
(1568-75, Historia, CORDE). The detached segment shows a coordinated structure in which the tonic pronoun
appears together with a reference to another participant not evoked by the weak pronoun. This may be
interpreted as an “update” of referential character aiming at rectifying - completing - the information
given in the core clause, which is characteristic of right dislocations as discussed above.

participants (Anagnostopoulou 1999: 770; cf. Lambrecht 1994: 94). This explains the common
assumption that pronouns are topics. However, with respect to the information
structure of a clause, pronouns may stand in different relations to the proposition, and 
may appear in the focus domain of an utterance, as part of the comment or as the sole 
constituent in focus (Lambrecht 1994: 128-130). 

In fact, since we are dealing with stressed pronouns, a focus status would typically 
be expected (Siewierska 2004: 183). But in Spanish the strong personal pronouns are not 
necessarily focal (Luján 1999), and in the data I examined, as it turns out, a clear focal 
interpretation imposed itself in a few cases only. The pronouns were always contrastive 
foci. (10) may serve as an example: 


(10) y   el  visorey respond-ió:    "Matar-me       h-an         si
     and the viceroy answer-PFV.3SG kill.INF-I.ACC  have-PRS.3PL if
     salg-o."      Aliaga dij-o:      "Primero me    matarán      a   mí."
     leave-PRS.1SG Aliaga say-PFV.3SG first    I.ACC kill.FUT.3PL ACC I
     ‘and the viceroy answered: “They will kill me if I come out”. Aliaga said: “They
     will kill me first”’ (1555-84, Guerras, CORDE)


In most cases, the doubled pronouns from my textual sources function as pragmatic 
topics. Some of them invite characterization in terms of “secondary topics” (Nikolaeva
2001). This analysis is suggested for pronouns occurring in a clause which “in addition 
to conveying information about the topic referents conveys information about the rela- 
tion that holds between them as arguments in the proposition” (Lambrecht 1994: 148). 
Consider (11): 


(11) A   esta Luscinda am-é,        qu-ise       y   ador-é        desde
     ACC this Lucinda  love-PFV.1SG like-PFV.1SG and adore-PFV.1SG since
     mis tiernos y   primeros años, y   ella    me    qu-iso       a
     my  tender  and first    years and she.NOM I.ACC like-PFV.3SG ACC
     mí, con  aquella sencillez  y   buen ánimo que  su  poca   edad
     I   with that    simplicity and good heart that her little age
     permitía.
     allowed
     ‘I loved, cherished and adored Lucinda since my early tender years, and she loved
     me with the simplicity and noble heart of her youth’ (1605, Quijote, CORDE)


Other pronouns, in spite of being objects, must be viewed as encoding the entity the 
proposition expressed by the sentence is primarily about. Pronouns with a primary topic 
role, as in (12), are very common in my data and will be discussed below (§4.5):


(12) que     nunca sent-imos    rumor  de gente  y   a   mí me
     because never feel-PFV.1PL murmur of people and DAT I  I.DAT
     parec-ió     que  deb-íamos     sal-ir    d-el     pueblo  de aquella
     seem-PFV.3SG that have-IPFV.1PL leave-INF from-the village of that
     manera
     manner
     ‘because we never heard voices and I thought [lit. ‘it seemed to me’] that we had
     to leave the village that way’ (1519-26, Cartas, CORDE)


Space limitations prevent me from showing that the strong pronouns which do not 
undergo doubling while the change is in process display a similar panorama of distri- 
bution between topic and focus relations. The motivation for indexing DOM, in other 
words, does not seem to have depended on the pragmatic structuring of the utterances. 

Let us stop for a moment to consider the functional shift which the coreferential pronoun
is experiencing in examples like the ones shown in (9) to (12). It is clear that binding
a dislocated object phrase to the core clause no longer corresponds to the function it ful- 
fills. The coreferential person form now co-occurs with a strong pronoun within one and 
the same syntactic domain, as opposed to the situation of medieval times when in this 
type of pragmatically unmarked or “neutral” environments the weak object pronoun and 
the strong personal pronoun were in complementary distribution (me or a mí, not both). 

The functional shift should be thought of in terms of a gradual process. In the initial 
phase, the doubling pronoun must have been felt as redundant. Indeed, redundancy will 
play a crucial role in the hypothesis I will outline in §4.3. But the doubling pronoun
eventually becomes categorical and gives rise to a phenomenon of coindexing on the
verb which has come to be viewed as an instance of object-verb agreement (Suñer 1988;
García-Miguel 1991; Bogard 1992; Fernández Soriano 1999; Franco 2000). Let us recall in
this regard (§2) that the grammaticalization of indexing DOM in the pronominal area
unfolded in parallel to the cliticization of the Spanish weak pronouns, a development
which makes it easy to defend the agreement analysis: the weak pronoun is morphologically
bound to the verb, forming with the “target” (Corbett 1983) of the agreement
relationship a phonological unit, without being attached to it like an affix.14

In addition to enabling the grammaticalization of object agreement in Spanish, the 
conversion of the weak pronouns into clitics may have contributed to the relative swift- 
ness with which coindexed strong personal pronouns became the norm. The time span 
reflected by written materials, as mentioned, covers a period of more or less two centuries.
We will also see that clitic doubling was generalized at different rates depending
on the individual speakers/authors, and in some cases very quickly (§4.5).


14 As already indicated, in the exceptional case of the imperative and non-finite verbal forms the Spanish
clitics are suffixed; cf. háblale ‘talk to him’, quiere verme a mí ‘he wants to see me’. In fact, arguments
have been advanced to justify the view that the Spanish clitics behave like inflectional affixes, on a par
with the subject agreement suffixes appearing on the verb (Alarcos Llorach 1980; Bogard 1992; Fontana
1993; Enrique-Arias 2003), but not everyone agrees with this analysis (Aijón Oliva & Borrego Nieto 2013,
among others). The lack of consensus has much to do with the fact that the weak forms, in the majority of
their occurrences, function as anaphoric pronouns encoding syntactic arguments. They are “ambiguous”
(Siewierska 2004: 126) agreement markers in this sense.

Two essential facts have to be kept in mind for a thorough understanding of Spanish 
indexing DOM, namely, that clitic doubling spread to accusative and dative tonics alike, 
and that once it became established with the personal pronouns it continued to evolve 
towards the nominal indirect object. 

Disregarding the second phenomenon, one could argue that indexing DOM was ex- 
tended to the dative object pronouns owing to the absence of formal case distinctions 
within the personal pronoun paradigm, as seen above (§2). The neutralization of the split
between accusative and dative would have taken place in accordance with the general 
properties of Spanish, a language in which the boundary between the two grammati- 
cal object functions is not sharp (García-Miguel 2015). From this perspective, one could 
then sustain that indexing DOM, irrespective of case considerations, and much like flag- 
ging DOM, began with the personal pronouns because the participants encoded by these 
forms are human, definite, and moreover highly prominent in discourse, all of which jus- 
tifies their superior ranking in the universal hierarchy of topicality. As to the question 
of why a new device was recruited to signal differential object features already marked 
with a, one could invoke the need for a “renewal” of DOM, in the sense that the clitic 
helped reestablish the original distinction between pronouns and non-pronouns which 
had existed before a was extended to some of the accusative nouns (§1).

The second phenomenon, however, forces us to modify these assumptions. Indeed, if 
the new marking had been fundamentally motivated by the pronominal features of ani- 
macy, definiteness and discourse prominence, one would have expected a development 
more in line with that of flagging DOM. Recall that a began as a topicalizer which did not
differentiate between accusative and dative pronouns either. In its descent towards the
non-pronouns, along the animacy hierarchy, a was directed at the more topical human 
and definite accusatives, and was not extended to the typically human and definite da- 
tives, because the Spanish datives were already case-marked with a homonym of flagging 
DOM (deriving from the locative uses of Latin ad). In the case of indexing DOM, how- 
ever, nothing prevented the clitic from moving along the same hierarchy to the whole 
range of more topical nouns, which would have included the datives and also many of 
the by then a-flagged accusatives. Instead, the clitic proceeded selectively, picking out 
the dative nominal with which it came to establish a systematic relation in the course of 
time (Silva-Corvalán 1984; Rini 1991; Melis & Flores 2009; Vázquez Rozas & García Salido 
2012). As a result of this expansion, clitic doubling is at present obligatory or strongly 
preferred in most dative contexts in all varieties of Spanish (Fernández Soriano 1999: 
1250), thus functioning as a perfectly entrenched indirect object agreement marker in 
the opinion of most scholars working on Spanish.15 An example is given in (13):


15 I have to mention that in some dialects, most notably in Argentinian Spanish, the doubling clitic is sometimes
used with nominal direct objects. There have been different attempts at explaining the triggering
conditions for the optional use of the clitic in these contexts, but the proposals suggest a total lack of agreement
(see Belloro 2007 for a good overview of the divergent hypotheses, ranging from “topic” to “focus”,
and from “presupposed” to “new” entities, among other claims; cf. Sánchez & Zdrojewski 2013 for additional
references). What seems clear is that the regional phenomenon obeys principles of its own, different
(13) El padre Miguel le entreg-ó a Sole una pequeña
the father Miguel she.DAT give-PFV.3SG DAT Sole a small
campana de bronce.
bell of bronze


‘Father Miguel gave Sole a small bronze bell’ (1999, González, Quién como Dios, 
CREA) 


In this example, the recipient argument Sole is introduced by the case-marking preposi- 
tion a, and the product of the modern extension of indexing DOM to the dative nominals 
is seen in the cross-referencing dative clitic le. Hence, within the nominal area, clitic dou- 
bling today also functions as a case marker, opposing datives (a + clitic) to accusatives 
(a or Ø). 

In his paper on the rise of object agreement, Givón (1976: 165) remarks on the ten- 
dency for dative to take precedence over accusative agreement in languages in which 
the accusative and dative objects are equally case-marked (or unmarked). And the au- 
thor also notes (Givón 1976: 169) that if the agreement system is allowed to mature, "the 
agreement primacy of one (mostly the dative) over the other becomes effectively the sig- 
nal differentiating the object cases from each other". On this view, Spanish would have 
evolved following a universal tendency. 

However, typological research carried out during the last few years has demonstrated 
that languages with indirective alignment like Spanish do not illustrate a situation in 
which the dative is indexed and the accusative is not (Haspelmath 2005: 12). So weighed 
against this new piece of evidence, the dative case marker Spanish developed through 
indexing represents “a typologically anomalous fact" (García-Miguel 2015: 232). 

To account for this anomaly, different explanations have been proposed. It is possible 
that dative indexing in Spanish arose as a means to promote oblique-like arguments to 
the level of core participants (García-Miguel 2015: 232-233). It has also been suggested 
that through the dative clitic a case distinction was reinforced in a language which has 
met with difficulty in keeping its two object categories apart (Melis & Flores 2009). What- 
ever the explanation, the point of major interest for this study is that the dative orien- 
tation of the subsequent development of indexing DOM in Spanish implies a distinction 
with strong ties to a concept of grammatical case functions. This property cannot be 
ignored when one tries to account for the emergence of Spanish indexing DOM. 


4.2 Previous approaches 


Before I present my analysis, a brief review of previous approaches to Spanish indexing 
DOM is in order. Under Silva-Corvalán's (1984) proposal, based on Givón (1976), Spanish 
clitic doubling shows a phenomenon of topic-verb agreement evolving into an object 
agreement marker, sensitive to the relative topicality of the object phrases. To support


from those underlying standard flagging and indexing DOM, since the doubling clitic occurs rather freely 
with inanimate entities and allows for these to be devoid of a. 
her proposal, the author observes that in the medieval texts, where, as she acknowledges,
doubling is scant, the objects that favor the occurrence of the clitic are fronted. From
these topicalizing structures, the clitic spreads to the personal pronouns, located in the 
upper region of the universal hierarchy of topicality, and later moves on to the nominal 
indirect object, overwhelmingly human and definite, and in this sense more topical than 
the direct object, which, as Silva-Corvalán (1984) argues, tends to be non-human and 
indefinite. 

Although the proposal is attractive, I have already commented (§3) on the difficulty
of establishing a direct connection between indexing DOM and the sparingly used topic- 
shift constructions of earlier times. Another problem lies in the crucial dependence of 
this account on the topical features of animacy and definiteness. As indicated in the 
preceding section, if these features had been the driving force one would have expected 
clitic doubling to spread not just to the dative nouns but also to the similarly human and 
definite a-marked accusatives. Finally, Silva-Corvalán's (1984) hypothesis disregards the 
fact that clitic doubling in the pronominal area begins as a highly selective process that 
picks out a few personal pronouns only (see §4.5), thus making it evident that some
additional factor beyond the shared topicality of the pronouns is at play. 

Gabriel & Rinke’s (2010) thesis, along the lines suggested by Givón (1976), is that
object agreement in Spanish derives from a reanalysis of the right-dislocated topic struc- 
tures of the medieval era. The authors work under the assumption that the coindexed 
objects of modern Spanish are "preferably construed as belonging to the focus domain" 
(Gabriel & Rinke 2010: 62), and argue that the proposed reanalysis is able to explain why 
topical participants, such as the personal pronouns and the human/definite dative nouns, 
occur in clitic doubling constructions in which they are assigned focus status. 

One weak point of this thesis relates to the presupposed focus interpretation of the 
doubled objects, which suits the dative nouns far better than it does the personal pro- 
nouns. Indeed, datives in Spanish are typically coded in the form of clitics, because 
their referents are prominent in the discourse, and when they appear as noun phrases 
they tend to (re)introduce “new” entities which are likely to form part of the comment 
(Vázquez Rozas & García Salido 2012: 286-287), but the personal pronouns, as seen in 
§4.1, cannot be assumed to function as foci on a regular basis. The more important objection
to be raised, however, has to do with the choice of the historical data, strongly
influenced by the focus thesis and represented by postverbal objects only (Gabriel & 
Rinke 2010: 75). It is clear that the skewed character of the sample must have seemed to 
lend support to the reanalysis of right dislocations, yet this was done at the expense of 
doubled objects occurring in other positions, which were simply left out of the study. 

Approaching Spanish indexing DOM from a different perspective, Rini (1991) takes the 
emphatic/contrastive property of the strong personal pronouns as his point of departure, 
and proposes that the use of the doubling clitic was developed as a means to compensate 
for the gradual loss of emphasis which the tonics had suffered over time. On this view, 
the duplicating element was recruited to form a construction with patently redundant 
attributes, the effect of which would be able to ensure the emphatic value of the strong 
pronominal object. As supporting evidence for the hypothetical loss of emphasis, Rini 
mentions the growing tendency for the dative tonics to occupy the preverbal position
starting from the 14th century (Rini 1991: 277). The preverbal datives are assumed to 
represent left dislocations (but recall example (9a) above), and are regarded by the author 
as instantiating an alternative strategy to clitic doubling, deployed for the same purpose 
of reinforcing the weakened emphasis of the strong personal pronouns. 

It will be seen below that I agree with Rini in giving importance to a notion of redun- 
dancy as a way of explaining the origin of Spanish indexing DOM. But I do not believe 
that a loss of emphasis was at issue. The gradual process oriented towards certain types 
of case roles, which will be analyzed below, suggests otherwise. The major problem here, 
however, has to do with the expansion of clitic doubling to the dative nouns. The author 
himself recognizes the challenge the diachronic evolution of the clitic poses to his the- 
sis "since NP duplication cannot be assumed to have ever been emphatic" (Rini 1991: 
282), and he is forced to speculate that the nominal indirect objects were submitted to 
doubling “by analogy" (Rini 1991: 282). 

To sum up, our survey of previous approaches to Spanish indexing DOM leads us 
to conclude that there remain important aspects linked to the development of this phe- 
nomenon which have not been fully elucidated. 


4.3 Hypothesis: the topic case hierarchy 


My analysis starts from the observation that the stressed object personal pronouns with 
which Spanish indexing DOM arises around the turn of the 16th century encode dis- 
course prominent referents — all are highly topical in this sense — and in addition share 
the same emphatic form that is indicative of the presence of a contrast drawn between 
the referent of the pronoun and other individuals. 

To answer the question of what may have triggered the innovative use of indexing 
DOM with the tonic personal pronouns, we first have to define the value the coreferential
pronoun supplies to the construction. As discussed above in §4.1, the use of the
coreferential pronoun in the innovative contexts no longer hinges on the occurrence of 
dislocated objects, and yet, at the same time, we are aware of the fact that the pragmati- 
cally marked structures of medieval times provide the single source of environments in 
which a prior use of coreferential pronouns was found. This gives us reason to turn to 
these structures to ask if something in the behavior of the coreferential pronoun might 
explain the new function it came to develop with the strong object personal pronouns. 
One good justification for this is that processes of change have been described as being 
“economical”, to the extent that diachronic changes have a tendency to seize upon exist- 
ing forms in a language, which are reused for new purposes (Hopper & Traugott 2003: 
73). 

My claim is that the relevant property we are looking for lies in the fact that the dis- 
located sentences of medieval times are characterized by the double mention of one and 
the same referent. Although it is evident that the double mention is syntactically moti- 
vated, it is no less true that it has the effect of enhancing the prominence of the topic 
participant in question. From this perspective, one may therefore suggest that the doubling
clitic was introduced in the area of the strong personal pronouns as a means of
drawing special attention to a referent by mentioning it twice, i.e. through redundancy. 
As Pulgram (1983: 41), cited by Rini (1991: 281), points out, in any of its forms a redundant
construction “aims at a kind of greater explicitness, emphasis, preciseness”, without necessarily
providing a clue as to what is being emphasized.

Defining the communicative intention of a redundant construction calls for a detailed 
examination of the contexts in which it occurs, to be compared with the use of analogous 
expressions lacking the redundant element. The comparative analysis of doubled and 
non-doubled pronouns to be presented in §4.5 will show that the "redundant"
coreferential pronoun was initially used in specific discourse contexts where it served 
the purpose of emphasizing the subjective involvement of the personal pronoun referent 
in the designated verbal situation. 

This original bias towards the role of the object participant in the event structure establishes
the scenario for the development of indexing DOM in Spanish, from the emergence
of its use with the personal pronouns until its conversion into a case marker with the lexical
indirect object. As mentioned in §1, the development at issue directs us to a specific
dimension of topicality, according to which the property of being "topical" is evaluated
with regard to the degree of involvement of a participant (more involved participant >
less involved participant) and functions as the parameter governing the case hierarchy
(agent > dative > accusative) (Givón 1976).


4.4 The data 


For the purpose of my study, a corpus of data was formed with examples retrieved from 
the electronic data base CORDE. Since the rise of Spanish indexing DOM is associated 
with the transition decades between the 15th and the 16th century, the first materials 
I reviewed were several works of the late 15th century. The sample included the play 
La Celestina. Tragicomedia de Calisto y Melibea by Fernando de Rojas (1499-1502) and a 
number of narrative and historical texts produced between the years 1480 and 1499. The 
comparison between textual sources made plain that doubling pronouns in the Celestina 
were more frequent than in the materials which did not intend to reflect “oral” productions
to the same degree. To give an example, the first person pronoun a mí ‘me’ in the
Celestina was doubled in almost 60% of its occurrences (26/44), as opposed to 26% of duplications
(25/97) in the narrative texts and 16% of doubled a mí (11/67) in the historical
works. These results suggested that the innovative use of the coreferential pronoun,
like most changes, got a firm footing in spoken language before finding its way into
writings. On the basis of these results, and given my interest in exploring the beginnings 
of indexing DOM, the decision was made to keep the more conservative texts for this 
research. Confronting theater plays with variable doubling would have been another 
possibility, of course, but those available from the CORDE for the period under study 
showed indexes of frequency similar to, or higher than, the percentages of the Celestina. 

The doubling data from the Celestina also seemed to suggest that indexing DOM with 
the pronouns began as a type of ego-centric strategy (or one involving the speaker/hearer 
dyad, but second person tonics were too scant to appreciate this). Thus, in comparison
to a mí, doubled in 60% of the examples, the third person singular pronoun (a él ‘him’ /
a ella ‘her’) only yielded 30% of duplications (7/23). These results motivated the decision
to organize the study around the first person pronoun which had played a leading role 
in the process of change. Dealing with a single form, to my mind, offered the additional 
advantage that a more homogeneous picture of the development of indexing DOM could 
be obtained, by excluding potential variables connected to the distinct persons. 

In this way, the definitive corpus of data came to consist of 794 tokens of (doubled and 
non-doubled) a mí, extracted from 14 narrative and historical works covering the period 
between 1482 and 1605. In Table 2 I show the distribution of clitic doubling associated 
with each of the texts I examined. It needs to be noted that the quantitative profiles 
vary considerably in spite of the temporal proximity of the texts. We may interpret the 
differences as reflecting individual preferences, which are not unusual when a change is 
in progress. At the same time, we cannot ignore the fact that, if viewed as a whole, the 
texts project the image of a rather quickly unfolding change. 

For the sake of my analysis, the textual sources were divided into three sets according
to the indexes of duplication: no more than 30% of duplication, around 50%, and a near 
categorical phenomenon of clitic doubling with the first person. 


4.5 Analysis and discussion 


I have advanced the hypothesis that indexing DOM in Spanish originated as a means 
to give the highest degree of prominence to a referent's subjective involvement in the 
action. This is achieved through a strategy of double mention, whose redundant value 
is exploited to create the desired emphasis. In order to verify the hypothesis, the textual 
sources belonging to the first set will be examined. They can help us track the beginnings 
of indexing DOM since doubling in these works is still exceptional. We will proceed by 
having a look at several pairs of examples. 
The first pair is shown in (14): 


(14) a. Y si culpa tiene Fortuna, no la pongas a mí. 


‘And if Fortune is to blame, don't put the blame on me’ (1495, Grimalte,
CORDE)


b. Porque entonces era enemigo queriendo cobrar de ti aquello que ya cobré, cuya 
causa a mí me puso descanso y a ti estos sospiros que tienes. Y si lloras lo que 
conmigo perdiste, yo asimesmo lo que contigo gané. 


"Because at the time I was your enemy, wanting to get from you that which I 
finally got, an outcome that gave me peace [lit. “put peace on/to me”] but left 
you with these sighs. And if you bemoan what you lost with me, I cry all the 
same over what I won with you’ (1495, Grimalte, CORDE) 


16 Additionally, as suggested by the anonymous reviewer, the variation in terms of doubling frequencies may 
also be due to the involvement of distinct textual traditions in the examined sources. This is a question of 
great interest, which, unfortunately, lies beyond the scope of this paper. 
Table 2: The distribution of clitic doubling with the personal pronoun a mí

DATE      TEXT          REGISTERED TOKENS   DOUBLING: NUMBER   DOUBLING: PERCENT

LOW INDEX OF DOUBLING
1482-92   Amadís        91                  27                 30
1495      Grimalte      58                  6                  10
1501      Tristán       73                  7                  10
1520      Ysopo         47                  12                 26
AVERAGE                 269                 52                 19

INTENSE COMPETITION
1504      Esplandián    39                  18                 46
1516      Floriseo      46                  25                 54
1517      Arderique     47                  21                 45
1555      Espejo        99                  59                 60
1560      Crónica       23                  14                 61
AVERAGE                 254                 137                54

GENERALIZED DOUBLING
1519-26   Cartas        41                  37                 90
1553-84   Guerras       33                  30                 91
1568-75   Historia      74                  68                 92
1595      Granada       27                  24                 89
1605      Quijote       96                  92                 96
AVERAGE                 271                 251                93
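
The group rows of Table 2 can be rechecked from the per-text counts. The following is a minimal sketch, not part of the original study: it assumes that each AVERAGE row reports pooled figures (summed tokens, summed doubled tokens, and the percentage of the pooled counts), which is what the printed numbers are consistent with. The names TABLE_2 and pooled are illustrative only.

```python
# Counts transcribed from Table 2: (text, registered tokens of "a mí", doubled tokens).
TABLE_2 = {
    "low index of doubling": [
        ("Amadís", 91, 27), ("Grimalte", 58, 6),
        ("Tristán", 73, 7), ("Ysopo", 47, 12),
    ],
    "intense competition": [
        ("Esplandián", 39, 18), ("Floriseo", 46, 25), ("Arderique", 47, 21),
        ("Espejo", 99, 59), ("Crónica", 23, 14),
    ],
    "generalized doubling": [
        ("Cartas", 41, 37), ("Guerras", 33, 30), ("Historia", 74, 68),
        ("Granada", 27, 24), ("Quijote", 96, 92),
    ],
}


def pooled(rows):
    """Return (total tokens, total doubled, pooled percentage) for one set of texts."""
    tokens = sum(t for _, t, _ in rows)
    doubled = sum(d for _, _, d in rows)
    return tokens, doubled, round(100 * doubled / tokens)


# Assumption: the AVERAGE rows are pooled totals, not means of the per-text percentages.
summary = {name: pooled(rows) for name, rows in TABLE_2.items()}
total_tokens = sum(tokens for tokens, _, _ in summary.values())

for name, (tokens, doubled, percent) in summary.items():
    print(f"{name}: {doubled}/{tokens} = {percent}%")
print("corpus size:", total_tokens)
```

Under this reading the three sets add up to the 794 tokens of a mí reported for the corpus.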


In both examples a mí functions as the dative argument of poner “to put something on 
someone”, and in both cases the choice of the strong pronominal has been motivated by 
the expression of a contrast (Fortune vs. me, me vs. you). The non-doubled use in (14a) 
represents the normal way of encoding the object pronoun at the time. By comparison, 
the context in which (14b) is inserted contains a far more elaborate opposition between 
the speaker's personal memories of a bygone love and the experience of the beloved one.
In this context, the redundant function of coreferential me is called upon to center the 
attention on the subjective experience of the speaker. 

The utterances in (15) are produced by the same character of the textual source, a 
rejected lover. 


(15) a. Mas esto a mí acaescer no puede, segunt el precio que ya me costaes y aún no 
sois mía. 


‘But this cannot happen to me, since you've already cost me a fortune and
you are still not mine’ (1495, Grimalte, CORDE)
b. y así como aquellos que por faltas suyas vergonçosos buelven a sus tierras, tal a
mí me acaesció, que con menos favor que partí me buelvo a los reinos dEspaña 
y castellana tierra donde yo natural era. 


‘and like those who due to errors of their own return to their homeland with
shame, so it happened to me, who returns to the kingdom of Spain and my 
native Castile having much less in my favour than when I left’ (1495, 
Grimalte, CORDE) 


(15a) follows a statement as to the fact that people easily let go of things that were 
easily obtained, and opposes the situation of the speaker, who cannot give up something 
that is still not his. The event alluded to in (15b) is more tragic: The speaker returns from 
a failed mission knowing that the woman who rejects his advances has conditioned a 
potential change in her attitude on the successful outcome of the assignment she herself 
imposed. The double-mention strategy in this example serves to emphasize the feelings 
of shame and despair which underlie the comparison with other defeated individuals. 

Now consider (16): 


(16) a. Suplico ante tu excelente majestad que otorgues a mí, tu servidora, esta gran 
merced 


‘Appearing before your excellent majesty I beg you to grant me, your servant, 
this great favour' (1520, Ysopo, CORDE) 


b. Que si Dios a mí de sus gracias alguna parte me diera, yo soy cierto que vos ya 
fuérades mía 


“If God had given me a fraction of her [Fiammetta's] talent, I am certain that 
you would be mine by now” (1495, Grimalte, CORDE) 


The non-doubled tonic occurs in a petition addressed to Jupiter, where the contrastive 
value of the pronoun is used to emphasize the distance that separates the humble peti- 
tioner from the king of gods. In the emotionally charged context of (16b), on the other 
hand, a doubled tonic surfaces. The speaker is the rejected lover of (15), who in this pas- 
sage laments his not having been blessed with the gift of eloquence, another condition 
imposed by the beloved for her to yield to his advances. This explains both the compari- 
son with Fiammetta, who does possess the gift, and the use of the redundant construction 
as a means of underscoring the fatal shortcoming that condemns the speaker to a life 
away from the woman he loves. 

The verb parecer ‘to seem’ is involved in the following choice between uses:


(17) a. ¡Por Dios -dixo Gorvalán-, a mí paresce locura en querer probar todas las aventuras!


“For God's sake -Gorvalan said- it seems madness to me wanting to have a 
taste of any kind of adventure!” (1501, Tristán, CORDE) 


b. En el nombre de Dios -dixo el Cavallero de la Verde Spada-, ésse me parece a mí 
el mejor acuerdo, porque, ahunque el Emperador sea mayor que vos, y tenga más 
gentes, para doze cavalleros tan buenos se fallarán en vuestra casa como en la
suya. 


“In the name of God -the Knight of the Green Sword said- this seems to me 
the best resolution, because, although the Emperor is older than you, and has 
more troops, for a fight with twelve knights you'll find as good ones among 
yours as he among his” (1482-92, Amadís, CORDE) 


(17a) and (17b) communicate a personal state of mind with respect to a proposal set 
forth by the interlocutor. In the lines preceding (17a) Tristan expresses his desire to go 
and rescue a noblewoman in distress, to which the speaker opposes his contrasting view 
on the matter with a simple a mí. In (17b), the king's plan to wage war against the twelve
knights of the emperor motivates a response that is fully supportive (“the best resolution”)
and elaborated upon (“because...”), in which the doubling form brings additional emphasis to
the degree to which the speaker approves of the decision for war.

My last examples are constructed with the verb placer 'to please, to like', which in cer- 
tain types of contexts comes closer to expressing a notion of will. This is especially true 
in dialogues where placer communicates the speaker's consent to a request or agreement 
with a proposal, and where, depending on the case, slightly different shades of meaning 
may emerge (‘it pleases me’, ‘I want to’, ‘it is my will’, ‘I agree’, etc.). In such environments
the stimulus argument is often omitted, being recoverable from the context:


(18) a. E dixo Tristán: -A mí plaze.
‘And Tristan said: “It pleases me”’ (1501, Tristán, CORDE)


b. E el rey dixo: -A mí me plaze, e fago gracias a Dios de tamaña merced como me
á fecho.
‘And the king said: “It pleases me, and I thank God for doing me this great
favour”’ (1501, Tristán, CORDE)


The sentence with the non-doubled pronoun is an expression of agreement with a 
travel mate's proposal to split up and go separate ways. (18b) is the king's response to a 
request for his daughter's hand, occurring at the end of a dialogue in which the father 
reiterates his consent, as well as his delight in the thought that his daughter will marry 
Tristan. The redundant construction contained in the response is a way of emphasizing 
the speaker's internal state of profound happiness. 

The examined pairs of examples have given us insight into the communicative strat- 
egy of redundancy which lies at the root of Spanish indexing DOM. As is expected to 
happen at the early stage of a grammaticalization process, the innovative function of the 
doubling clitic is appealed to in specific discourse contexts, here suggestive of a search 
for greater expressivity or emphasis regarding the involvement of a participant in the 
denoted event. Following Haspelmath (1999: 1057), we could say that the emergence of 
Spanish indexing DOM illustrates the “extravagance maxim” characteristic of the actions
of speakers who “want their utterance to be imaginative and vivid". What is easier to 
understand after the examination of the examples is why the strong personal pronouns 
were good candidates to trigger the new strategy. They were indeed emphatic forms,
which in themselves implied that a personal attitude or behavior would be brought to 
stand out through the means of a contrast, and this is precisely what made them eligible 
to become the targets of some additional emphasis. So even though one can never ex- 
plain why a change takes place, it is possible to state that Spanish indexing DOM arose 
in contexts where the contrastive value of the strong pronouns and the emphatic aim of 
the redundant construction fused in a natural and harmonious way. 

If my proposal is on the right track, it should receive support from the evolutionary 
path of the clitic. As a change progresses, an increase in the frequency of the new form 
is detected, and coupled with this increase certain patterns of use become visible. The 
choice of the new form over the older one loses its dependency on specific discourse 
contexts and acquires some systematicity, meaning that certain types of contexts now 
motivate the appearance of the new form on a regular basis. In order to verify this, the 
corpus texts pertaining to the second set may prove useful, since the extension of clitic 
doubling to one half of the registered examples profiles a movement towards the consol- 
idation of indexing DOM. 

As it happens, the distribution between doubled and non-doubled a mí in the texts 
under discussion affords a clear pattern, which resides in the near obligatoriness of the 
clitic with one particular verbal class, namely, mental predicates specialized in denoting 
a subjective attitude, whether intellectual or emotional. Thus, the tonic pronoun with 
parecer (‘it seems to me, I think’) is doubled in almost all of its occurrences (28/29 =
96.5%), while placer (‘it pleases me, I like’) and its antonym pesar (‘it grieves me, I lament’)
motivate the duplication of a mí in 83% (15/18) of the registered examples. There are
also complex predicates that convey similar meanings (ser oscuro ‘it is obscure to me, I
don’t understand’, causar pena ‘it causes me grief, I am sorry’, dar contento ‘it gives me
happiness, I am happy’, caer en gracia ‘it strikes me as funny, I am amused’, etc.), and
they too trigger doubling with high frequency (16/20 = 80%).

All these mental predicates take a dative experiencer argument, and are construed, 
as is usual, with a stimulus of inanimate reference, coded in the form of a noun phrase 
when designating some object (cf. something pleases me) or appearing as a clausal com- 
plement when expressing a situation (cf. it pleases me that...). In this way, the sole hu- 
man participant to go on stage is the dative experiencer (a mí), highly salient, whose 
subjective attitude with regard to some entity or event is the focus of the utterance. Span- 
ish experiencers of this type are associated with a series of peculiar features that have 
prompted their analysis in terms of "dative subjects" (see Melis & Flores 2013, and refer- 
ences therein). Their subject-like behavior comes as no surprise considering that mental 
meanings of analogous nature are often expressed, in Spanish and in other languages, 
with nominative-experiencer predicates. 

It makes sense that indexing DOM grammaticalized first with these mental predicates, 
having moved along a path that leads from a redundant emphasis on one's subjective 
involvement in a situation to a class of verbs specialized in the description of one's subjec- 
tive mental state. The predicates in question also confirm that the doubling clitic was tied 
to a notion of participant roles ever since it was introduced into the domain of the strong
personal pronouns. This can be inferred from the character of the predicates' experiencer
argument. Experiencers never perform like volitional agents. Yet mental experiences can 
be construed from different vantage points, and in some of these construals the internal 
process appears to be under the control of the experiencer. The mental predicates under 
discussion are of this type: they do not express the reaction of an experiencer to the im- 
pact of a stimulus, but portray a subject-like dative experiencer as being in a state with 
respect to a given object. Hence, in the case hierarchy (agent > dative > accusative) proposed
by Givón (1976: 152), the dative experiencer of these predicates would be placed
near the top-end (no agent but subject-like). And in light of this, one is able to argue 
that Spanish indexing DOM first spread to these experiencers because they were more 
"topical" than all the other object pronouns implicated in the change. 

It is now worth examining the behavior of the less topical objects in the texts of the 
second set. These objects occur in sentences containing another human participant who 
carries out the action and functions as the topical subject. So the case hierarchy predicts
that doubling with these "less involved" object participants should lag somewhat be- 
hind, as the data corroborate. Additionally, the case hierarchy leads us to expect that 
the higher-ranked datives should motivate the use of the clitic more often than the ac- 
cusative pronouns. But the data are less transparent in this regard for one obvious reason: 
the distinction between more and less involved participants was neutralized due to the 
formal identity of the pronouns. 

The less topical objects were found to display percentages of doubling hovering around 
50%, irrespective of the dative/accusative distinction. To investigate the dative function, 
I gathered the verbs of “giving” (primarily dar ‘to give’, but also otorgar “to grant’, ofrecer 
“to offer”, encomendar ‘to entrust’, etc.) and the verbs of “saying” (decir ‘to say”, contar 
“to tell”, pedir “to ask”, prometer “to promise', mandar 'to order”, etc.), with which the ref- 
erent of a mí is semantically speaking a "recipient". Taken together, these verbs yielded 
duplicated tokens of a mí in 52% of the examples (28/54). Curiously, when viewed as sep- 
arate verb types, a striking disparity as to their behavior emerged: 79% of duplications 
(11/14) with verbs of "saying", against 42.5% (17/40) with verbs of “giving”. The elevated 
percentage in the former case would probably need some tuning given the numerical 
poverty of the sample. In the latter case, the low percentage may be related to the fact 
that some of the sentences built with a verb of “giving” (dar la muerte ‘to kill’, lit. ‘to
give death’, atribuir la culpa ‘to blame’, lit. ‘to attribute a fault’, etc.) have a dative coded
argument whose semantic role comes closer to that of a patient. This does not happen 
with the verbs of "saying", always accompanied by a dative who participates in the com- 
pletion of the event by processing the received message. So it is possible after all that 
the discrepancy between "saying" and "giving" verbs with respect to the frequency of 
doubling may reflect the operation of an underlying scale of degrees of involvement. 
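
The proportions reported in this paragraph can be recomputed from the raw counts. The following Python sketch is an illustration only and not part of the original study: it recalculates the percentages and, as one possible way of gauging the “numerical poverty” of the sample, adds an approximate 95% Wilson score interval. The names COUNTS, share and wilson_interval, and the choice of the Wilson interval itself, are assumptions introduced here.

```python
from math import sqrt

# Token counts of a mí as reported in the text: (doubled, total).
COUNTS = {
    "saying": (11, 14),
    "giving": (17, 40),
    "giving+saying": (28, 54),
    "accusative": (36, 79),
}


def share(doubled, total):
    """Percentage of tokens that show clitic doubling."""
    return 100.0 * doubled / total


def wilson_interval(doubled, total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion, in percent.

    Assumption: the tokens are treated as independent observations,
    which is a simplification for corpus data.
    """
    p = doubled / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return 100 * (centre - half), 100 * (centre + half)


for label, (k, n) in COUNTS.items():
    low, high = wilson_interval(k, n)
    print(f"{label:14s} {k:3d}/{n:<3d} {share(k, n):5.1f}%  [{low:.1f}, {high:.1f}]")
```

On samples of this size the interval for the verbs of “saying” comes out wider than the one for the verbs of “giving”, which is one way of seeing why the 79% figure calls for caution.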

The accusative population of a mí, on the other hand, is associated with a rather
heterogeneous set of verbs (ver ‘to see’, engañar ‘to deceive’, matar ‘to kill’, librar ‘to free’,
traer ‘to bring’, buscar ‘to look for’, etc.), which does not offer the opportunity of
inspecting the behavior of particular subclasses given the meager representation of the distinct
event types. Globally, the accusative pronouns attract clitic doubling in 45.5% of the
registered examples (36/79). A more fine-grained contextual analysis would be necessary
to uncover why some patients were judged to be better candidates for doubling than 
others. 

In the next step of the grammaticalization process, the distinction between more topi- 
cal and less topical pronouns becomes obliterated, allowing for the spread of the clitic to 
all tokens of a mí as a near to obligatory object agreement marker. This is the situation 
which the textual sources of the third set bring to view. Eventually, indexing DOM will 
be extended to the entire category of the strong personal pronouns, marking datives and 
accusatives alike.

The lack of formal case distinctions within the domain of the Spanish personal pro- 
nouns has to be viewed as the principal reason for why the accusative pronouns were 
drawn into the orbit of the grammaticalization process. If we understand this, the fol- 
lowing historical events related to clitic doubling in Spanish fall into place: The control 
exercised by the topic case hierarchy over the progression of the clitic recovers visibil- 
ity and propitiates the development of the object agreement device into a case marker 
reserved for the dative lexical nouns. The datives are the obvious targets, because they 
rank above the accusative objects in the case role hierarchy. 

From this point of view, the question of how the strong object personal pronouns 
became subjected to a second type of marking can also be resolved. Although the
co-occurrence of two mechanisms might, at first sight, suggest a case of useless
overlapping, the truth is that a and the doubling clitic complement each other. Both have been
motivated by a factor of topicality, but the dimensions involved are not the same. Old 
flagging DOM signals the prominence of the personal pronouns on the animacy scale; 
it is sensitive to their semantic properties. Newer indexing DOM is concerned with de- 
grees of involvement in relation to the case hierarchy; it evaluates a participant's role in 
the event structure. This justifies the association of the Spanish personal pronouns with 
two types of DOM.


The historical data make clear that the development of the clitic into a near categorical object agreement 
marker took some more time with the third person pronouns. For example, in Hernán Cortés’ Cartas, where
a mí is accompanied by a doubling clitic in 90% of the examples (Table 2), the third person pronouns show
57% of duplications (38/67), and in Cervantes’ Quijote, one century later, a mí yields 96% of agreement
(Table 2), against a frequency index of 77% (44/57) in the third person area. In order to verify the later 
entrenchment of the third person clitic, I reviewed a sample of narrative and historical texts, dating from 
the years 1660 to 1699. My sample showed 98% of clitic doubling with a mí (130/133) and 79% with the third 
person pronouns (162/205), thus confirming that these were lagging slightly behind. Curiously, “us”, “you” 
and “you all” were found to behave much like the third persons (66/86 of doubling = 77%). So it appears that 
the grammaticalization process of indexing DOM was from beginning to end somewhat biased towards the 
highest-ranked entity on the topic person hierarchy (ego). 

The problem of defining the semantic import of Spanish indexing DOM has been addressed in the liter- 
ature. On the whole, scholars have been especially concerned with offering an account that may serve 
to differentiate the contribution of the clitic from that of the animacy-related preposition a. But no agree- 
ment has been reached. Thus, for some, the clitic is supposed to encode the semantic feature of "specificity" 
(Suñer 1988) or “definiteness” (von Heusinger & Kaiser 2003; Leonetti 2008). From another perspective, the 
doubling form is associated with a condition of discourse “prominence”, for which the notions of both 
familiarity and activation are relevant (Anagnostopoulou 1999; cf. von Heusinger & Onea Gáspár 2008). 
And it is also viewed as a mechanism that simply serves to emphasize the heightened topicality of DOM 
marked objects (Escandell-Vidal 2009). My proposal seeks to throw new light on this question. 

5 Conclusions


The present study has dealt with a DOM language whose strong object personal pro- 
nouns bear two obligatory markings: they are flagged with the preposition a and are 
indexed on the verb by means of a clitic pronoun. Spanish flagging DOM, which goes 
back to the recorded beginnings of the language, has been thoroughly investigated in 
both diachronic and synchronic works. Indexing DOM, also known as clitic doubling 
(and DOI under Iemmolo's (2014) proposal), is the product of a later development traced 
to Renaissance Spanish. It has received less attention in the literature and has been the 
focus of this paper. 

One common assumption underlying the approaches to phenomena of differential ob- 
ject marking in the languages of the world is the idea that the development of these 
marking systems proceeds under the guidance of a handful of universally operating hi- 
erarchies. However, this assumption has recently been challenged by Bickel & Witzlack- 
Makarevich (2008), who invite us to consider the possibility that different systems of 
DOM might originate from individual, highly specific, and non-comparable diachronic 
changes. What the history of the two types of Spanish DOM suggests, as I want to show 
in my conclusions, is that room should be allowed for both scenarios. 

Thus, starting with the origin of the two marking devices, it is clear that we are being 
directed to familiar discourse-pragmatic strategies of cross-linguistic character. Flagging 
a begins as a topicalizer, while the (future) indexing clitic of DOM, in the form of a more 
independent coreferential pronoun, emerges in topic-shift constructions where it binds 
dislocated objects to the core clause. 

Both trajectories are also closely tied to the personal pronouns at the beginning stage. 
A distinction between pronouns and non-pronouns is a well-attested tendency in differ- 
ential marking systems (Comrie 1989: 195). It reflects the way in which language users 
tend to conceive of the participants coded in the form of personal pronouns as more 
worthy of being talked “about”, so that the pronouns naturally come to occupy the up- 
per regions of the universal hierarchy of topicality. In the case of flagging DOM (Pensado 
1995b), the pronominal connection is visible at the onset (late Latin and early Romance), 
when a topicalizes the object pronouns of first and second person. With indexing DOM, 
the connection is established as soon as the clitic starts to develop its differential marking 
function in Renaissance Spanish. 

How the clitic acquires this function is the result of a particular diachronic change,
one that cannot readily be generalized cross-linguistically, or at least is not expected to
lend itself to such generalization. Without entering into the details of the study presented
in this paper, suffice it to say that the functional shift experienced by the coreferential
pronoun is achieved by means of a purposefully redundant construction, used to
emphasize the subjective involvement of the pronominal referent in the denoted situation.

Beyond the peculiarity of this change, the evolutionary paths of both types of DOM 
bring us back to hierarchies of universal scope. On one side, flagging a, linking up with 
the human and definite features of the topical pronouns, begins its descent along the 
animacy hierarchy and grammaticalizes into a nearly obligatory marker with all direct 
objects of human reference.

The grammaticalization process of indexing DOM, on the other side, evidences the 
influence of one of the hierarchic relations involved in the definition of what it means to 
be topicworthy. Topicworthiness in this case hinges on an underlying concept of agen- 
tivity and ranks the discourse participants along the hierarchy of semantic case roles 
in accordance with the degree to which the participants contribute to the event. Ob- 
serve that the specific evolution of indexing DOM has been anticipated in the use of 
pragmatic redundancy for the purpose of highlighting the subjective involvement of the 
twice-mentioned participant. It is this original concern with role issues that predisposes 
the doubling clitic to become sensitive to the case hierarchy. The control exercised by 
the case hierarchy on Spanish indexing DOM is perceived during the expansion period 
of the grammaticalization process, via the early entrenchment of the clitic with the more 
topical subject-like datives; it loses transparency with the extension of the clitic to all 
the strong object pronouns regardless of their dative or accusative role (propitiated, as 
I suggested, by the lack of formal case distinctions within the Spanish pronominal sys- 
tem); and it again becomes visible when the clitic is introduced into the nominal area of 
the more topical datives to develop a case-marking function that separates the higher- 
ranked dative participants [+ clitic] from the lower-ranked accusative object nouns [- 
clitic]. 

From this perspective, it is easier to understand why the strong object personal pro- 
nouns carry double marking. Flagging DOM interacts with the semantic properties of 
animacy and definiteness, whereas the relevant criterion for indexing DOM is the role 
of the participant in the event structure. The topicworthiness of the personal pronouns 
is thus simultaneously evaluated on two separate dimensions. 

Royal Academy of Spanish, accessible through, http://www.rae.es

CORDE        Electronic database Corpus diacrónico del español of the Royal Academy of Spanish, accessible through http://www.rae.es

Amadís       Garci Rodríguez de Montalvo, Amadís de Gaula, 1482-92
Arderique    Juan de Molina, Libro del esforzado caballero Arderique, 1517
Cartas       Hernán Cortés, Cartas de relación, 1519-26
Crónica      Francisco Cervantes de Salazar, Crónica de la Nueva España, 1560
Espejo       Diego Ortúñez de Calahorra, Espejo de príncipes y caballeros, 1555
Esplandián   Garci Rodríguez de Montalvo, Las sergas del virtuoso caballero Esplandián, 1504
Floriseo     Fernando Bernal, Floriseo, 1516
Granada      Ginés Pérez de Hita, Guerras civiles de Granada, 1595
Grimalte     Juan de Flores, Grimalte y Gradisa, 1495
Guerras      Pedro Cieza de León, Las guerras civiles peruanas, 1553-84
Historia     Bernal Díaz del Castillo, Historia verdadera de la conquista de la N. España, 1568-75
Quijote      Miguel de Cervantes Saavedra, El ingenioso hidalgo don Quijote de la Mancha, 1605
Tristán      Anonymous, Tristán de Leonís, 1501
Ysopo        Anonymous, Vida de Ysopo, 1520

Abbreviations

1      first person       INF    infinitive
2      second person      IPFV   imperfective
3      third person       MASC   masculine
ACC    accusative         NOM    nominative
COMP   complementizer     PFV    perfective
DAT    dative             PL     plural
FEM    feminine           PRS    present
FUT    future             REFL   reflexive
IMP    imperative         SG     singular

From suffix to prefix to interposition via
Differential Object Marking in 
Egyptian-Coptic 


Eitan Grossman 


The Hebrew University of Jerusalem 


This article argues that Differential Object Indexing (DOI) and Differential Object
Marking (DOM) constructions in Coptic (Afroasiatic, Egypt) are reanalyzed, resulting in
a set of verbs with interposed P-indexes within bipartite stems (DeLancey 1996; Nichols 
2003). Basically, incorporated noun phrases with prefixed possessor indexes become parts 
of derived verbs with unpredictable lexical semantics, and their erstwhile possessor prefixes, 
entrapped within the derived verb, are reanalyzed as P-interpositions. Since this possessor 
prefix ultimately developed from an earlier possessor suffix, the pathway documented here, 
stripped down to its essentials, is SUFFIX → PREFIX → INTERPOSITION, and erstwhile complex
construction → BIPARTITE STEM. Finally, an overt genitive prefix that marks lexical
possessors of incorporated noun phrases is reanalyzed as an accusative case prefix. These 
changes introduce new complexity into Coptic Differential Argument Marking: not only 
are P arguments either indexed as suffixes, case marked, or incorporated for the majority of 
verbs, they can be indexed as interpositions for a lexically determined set of verbs. 


1 Introduction 


In recent years, Differential Object Marking (DOM) has been distinguished from
Differential Object Indexing (DOI) (Iemmolo 2011), but both fall under the generalized
definition of Differential Argument Marking proposed by Witzlack-Makarevich & Seržant
(this volume), i.e. “Any kind of situation where an argument of a predicate bearing the
same generalized semantic role (or macrorole) may be coded in different ways,
depending on factors other than the argument role itself.” Under this definition, as
Witzlack-Makarevich & Seržant point out, “DAM is not restricted to case marking (also called
dependent marking or flagging [...]) but also includes differential agreement (or head
marking or indexing).” However, since some languages have both DOM and DOI, the
two can interact, sometimes in complex ways.

The aim of this article is to show one way that DOM and DOI can interact in language
change. It is argued that for a number of verbs, the specific constructions implicated in
both DOM and DOI in Coptic (Afroasiatic, Egypt) are reanalyzed, resulting in the
reanalysis of a prefixed possessor index as an interposed P-index within a bipartite stem.1
Bipartite stems, described by Jacobsen (1980), DeLancey (1996), and Nichols (2003) for
some North American and Nakh-Daghestanian Caucasian languages,2 are defined by
Nichols (2003: 321) as “a segmentable simplex stem; or a stem with inflection positioned
so as to split the stem into two parts”. The term interposition is used to characterize the
person index that occurs between the two pieces of a bipartite stem, and the term
“interposed” is used to describe its position of occurrence. Interpositions are distinguished
from infixes, which “occur inside of a simple stem, where [their] position is usually
defined phonologically” (Nichols 2003: 321).

The coding of P arguments in Coptic involves both DOM and DOI, since a lexical P 
argument can be either overtly case marked or incorporated into the verb (DOM), or 
can be indexed on the verb as a suffix (DOI). For the vast majority of transitive verbs, 
both case marking and incorporation of P are in complementary distribution with P- 
indexing. However, for a lexically-determined set of verbs, incorporated noun phrases 
with prefixed possessor indexes become parts of derived verbs with unpredictable lexical 
semantics, and their erstwhile possessor prefixes, entrapped within the derived verb, are 
reanalyzed as P-interpositions. Since this possessor prefix ultimately developed from an 
earlier possessor suffix, the pathway documented here, stripped down to its essentials, is 
SUFFIX → PREFIX → INTERPOSITION, and erstwhile complex construction → BIPARTITE
STEM. Finally, an overt genitive prefix that marks lexical possessors of incorporated noun 
phrases is reanalyzed as an accusative case prefix. 

All in all, these changes introduce new complexity into Coptic DAM: not only are P 
arguments either indexed as suffixes, case marked, or incorporated for the majority of 
verbs, they can be indexed as interpositions for a lexically determined set of verbs. 

The structure of this article is as follows. §2 presents the basic problem dealt with here.
§3 describes some background about the marking of grammatical relations in Coptic.
§4 presents some basic facts about the synchrony and diachrony of possessive phrases
in Ancient Egyptian-Coptic, tracing the replacement of suffixed possessor indexes by
prefixed possessor indexes. §5 shows how prefixed possessor indexes are reanalyzed as
infixed P indexes. §6 suggests that this process, alongside the well-known ‘Have-drift’
(Comrie 1981, Stassen 2009), is yet another type of ‘P-drift’, in which non-P arguments
are reanalyzed as P arguments. §7 concludes and sketches what an explanation for P-drift
might look like.


1 In this article, I follow the Comrian approach to transitivity and argument roles articulated in Comrie (1981),
Lazard (2002) and Haspelmath (2011). Basically, transitive clauses are those with A and P as core arguments.
A and P arguments are those that are coded like the arguments of a prototypical biactant clause in which
the predicate expresses an action, e.g. ‘kill’.

2 I would like to thank Alena Witzlack-Makarevich for drawing my attention to this similarity.

2 The problem: P infixes within a lexical verb


In Coptic, P indexes are bound to the rightmost edge of the lexical verb in monotransitive
clauses.3 In (1),4 the 3SG.M index is -f.


(1) Coptic (Layton 2004: 138)
a-f-kaa-f                 e-f-onh
PST-3SG.MA-leave-3SG.MP   CVB-3SG.M-live\STAT
‘He left him alive.’

Exceptionally, however, a small number of verbs have interposed P-indexes, which 
occur within the lexical verb. 


(2) Coptic (2 Timothy 2:14)
mar-ou-rpe<u>meeue
JUSS-3PLA-remember<3PLP>remember
‘Let them_i remember them_j.’

(3) Coptic (Besa 4:17)
n-se-tm-rpe<u>ōbš
SEQ-3PLA-NEG-forget<3PLP>forget
‘that they_i not forget them_j.’

(4) Coptic (Matthew 25:36)
a-tetn-cmp<a>šine
PST-2PLA-visit<1SGP>visit
‘You visited me.’


When the P argument of these verbs is a lexical noun phrase, on the other hand, it is 
not indexed on the verb at all. Rather, it is marked by an overt case prefix n- (m- before 
labials), glossed here as accusative (Acc). 


(5) Coptic (Luke 22:61)
a-petros    -rpmeeue    m-p-šače            m-p-čoeis
PST-Peter   -remember   ACC-DEF.M.SG-word   GEN-DEF.M.SG-lord
‘Peter remembered the word of the Lord.’


3 The examples presented here are glossed according to the Leipzig Glossing Rules, and are transliterated
according to the Leipzig-Jerusalem system (Grossman & Haspelmath 2015). Abbreviations used in the glosses
in this article, beyond those found in the Leipzig Glossing Rules list, are: AOR – aorist, basically a habitual
verb form; BG – backgrounder, prefix that marks the verb as topical and, in the present case, an adjunct
as focal; MOD – modifier marker; SEQ – sequential verb form; STAT – stative verb form. The glossing
convention of a space followed by a hyphen indicates that the morpheme following the hyphen is part of the
same morphological (but not phonological) word.

4 Examples are cited as found in easily accessible secondary sources, such as Layton (2004), an excellent
descriptive grammar, or Shisha-Halevy (1988), a learner's chrestomathy based on authentic Coptic texts.
This is because Coptic texts are usually published in text editions that are not easily available to
non-specialists. In some cases, I have cited examples from the letters and sermons of Besa, a Coptic abbot. The
references are to page and line number of Kuhn's edition (Kuhn 1956).

(6) Coptic (Luke 7:16)
a-p-noute          -cmpšine   m-pe-f-laos
PST-DEF.M.SG-God   -visit     ACC-POSS.M.SG-3SG.M-people
‘God visited his people.’
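
The two shapes of the accusative prefix described above (n-, with m- before labials) can be sketched in a few lines of Python. This is a minimal illustration rather than a description of Coptic phonology: the function name accusative_prefix and the set of segments treated as labials (p, b, m) are assumptions introduced here, and hosts other than noun phrases (for example person indexes) are not covered.

```python
# Assumption: the segments treated as labial here are p, b and m.
LABIALS = ("p", "b", "m")


def accusative_prefix(host: str) -> str:
    """Attach the accusative prefix to a transliterated noun phrase.

    Uses m- when the host begins with a labial and n- otherwise,
    following the statement "n- (m- before labials)".
    """
    prefix = "m" if host.startswith(LABIALS) else "n"
    return f"{prefix}-{host}"


# Hosts taken from the glossed examples in this article.
for host in ("p-šače", "pe-f-laos", "ou-daimonion", "te-f-sēfe"):
    print(accusative_prefix(host))
```

The rule is stated over the first segment of the whole noun phrase, so a determiner or possessive prefix such as p- or pe-, and not the noun stem itself, decides the shape of the case prefix.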


From a synchronic point of view, this is a curious fact: for a small list of verbs, the P 
argument is indexed within the lexical verb. However, from the point of view of language 
change, this unusual feature has a clear explanation. 

In short, it is argued that these verbs are derived, via noun phrase incorporation,5 from
the compounding of a verbal root and a possessive noun phrase, in which the possessor
index is prefixed to a lexical noun. Returning to example (4) above, the original structure
is as follows:


(7) Coptic (Matthew 25:36)
a-tetn-cm-p-a-šine
PST-2PL-find-POSS.M.SG-1SG-report
‘You visited me.’ (lit. ‘you found my report’)


The possessor prefix (pa-) originates from a construction in which an even earlier
possessor index (corresponding to Coptic -a 1SG) is suffixed to a demonstrative base
(corresponding to Coptic p- POSS.M.SG). However, in order to demonstrate that the synchronic
structure involves an interposed P-index, rather than a prefixed possessor index, I show
that the meanings of the verbs derived via incorporation of these noun phrases are not
completely predictable. In short, the lexical verb ‘visit’ in Coptic is cmpšine, and <a> is
interposed in a position that is synchronically arbitrary but historically explicable.


3 The background 


3.1 Ancient Egyptian-Coptic 


Ancient Egyptian-Coptic, the indigenous language of Egypt, is an independent branch 
of the Afroasiatic phylum. It is documented from around the turn of the 3rd millennium
BCE up until the 13th or 14th century CE, when its last speakers shifted to Arabic; for
overviews of Ancient Egyptian, see Loprieno (1995); Loprieno & Müller (2012); Gross- 
man & Richter (2015), or Haspelmath (2015a). Coptic, the latest stage of the language, 
is documented in a dozen or so literary dialects, as well as a range of less standardized 
language varieties attested in non-literary texts, such as private letters, legal documents, 
and financial records. The main literary dialects are Sahidic and Bohairic. The data for 
the present article are taken from the Sahidic dialect, which is the best described (Layton 
2004; Reintges 2004; Shisha-Halevy 1986). 


5The notion ‘incorporation’ is usually not extended to constructions in which nominals with phrasal prop- 
erties (e.g. determination, possessor marking, etc.) are attached to verbs. However, some accounts of incor- 
poration do indeed recognize that such nominals may be incorporated (e.g. Aikhenvald 2007; Grossman 
forthcoming), and some languages are described in a straightforward way as incorporating determiners 
and other items typically associated with noun phrases, e.g. Donohue's (1999) description of Warembori. 

3.2 Grammatical relations in Coptic: a brief overview


Due to the complexity of Coptic grammatical relations, I focus on the coding properties 
of intransitive and monotransitive verbal clauses, i.e. those with S or A and P as argu- 
ments of the predicate, leaving out ditransitive clauses and clauses with non-accusative 
objects. §3.2.1 deals with argument indexing, §3.2.2 with case marking, and §3.2.3 with
incorporation. Before proceeding to the presentation of grammatical relations, it is im- 
portant to briefly describe two basic facts of Coptic transitive verbs. First, each lexical 
verb occurs in up to three distinct allomorphs, the conditioning factor being the encoding
of P (see Table 1). The allomorphs are labelled here as distinct stems (represented as
Σ with superscript numerals, borrowing a practice of Sino-Tibetan linguistics).6


1. The free form of the verb (Σ¹) occurs when no P is present or when P is overtly
case marked. It is also the citation form.

2. A second allomorph (Σ²) occurs when lexical P is incorporated.

3. A third allomorph (Σ³) occurs when P is indexed on the verb.7


Table 1: Allomorphs of the Coptic verb

                 FREE FORM (Σ¹)   WITH INCORPORATED P (Σ²)   WITH P INDEX (Σ³)
‘draw (sword)’   tōkm             tekm-                      tokm-
‘drink’          sō               se-                        soo-
‘find’           cine             cn-                        cnt-


The allomorphs that occur with incorporated P (Σ²) or indexed P (Σ³) are bound forms,
i.e. they cannot occur as free forms.
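
The conditioning of the three stems by the coding of P can be restated as a small lookup. The following Python sketch is an illustration under simplifying assumptions: it encodes only the three verbs of Table 1, it follows the simplified presentation in the main text (coding of P as the conditioning factor), and the names STEMS and select_stem as well as the labels used for the coding of P are introduced here for convenience.

```python
from typing import Optional

# Stems from Table 1, in the order (free form, incorporated-P form, indexed-P form).
STEMS = {
    "draw": ("tōkm", "tekm-", "tokm-"),
    "drink": ("sō", "se-", "soo-"),
    "find": ("cine", "cn-", "cnt-"),
}


def select_stem(verb: str, p_coding: Optional[str]) -> str:
    """Return the stem allomorph required by the coding of P.

    p_coding is None (no P present), "case-marked", "incorporated" or "indexed".
    """
    free, incorporated, indexed = STEMS[verb]
    if p_coding is None or p_coding == "case-marked":
        return free          # free form, also the citation form
    if p_coding == "incorporated":
        return incorporated  # bound form hosting an incorporated lexical P
    if p_coding == "indexed":
        return indexed       # bound form hosting a P index
    raise ValueError(f"unknown coding of P: {p_coding!r}")


print(select_stem("draw", "case-marked"))
print(select_stem("draw", "incorporated"))
print(select_stem("find", "indexed"))
```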

Second, Coptic verbs occur in two main constructions, which will be treated here as 
templates. The first is the Present tense (see Table 2), which comprises two main slots 
(Polotsky 1960). The first slot, for the A argument,8 is occupied either by a lexical noun
phrase or a prefixed person index. The second slot is occupied by the lexical verb or by a 
locative expression. P cannot be indexed on the verb, but is rather overtly case-marked. 

The second construction is for all verbal templates other than the Present tense. It com- 
prises three obligatory slots. The first is occupied by a TAM/Polarity prefix, the second 
by an A index, and the third by a lexical verb. P-indexes occur in an optional fourth slot, 
suffixed to the lexical verb. As is discussed in the following section (§3.2.1), P-indexes
and case-marked P are largely in complementary distribution. 


6 Guillaume Jacques (p.c.) informs me that it was Georg van Driem who originated this practice.

7 This presentation is more convenient than precise. Actually, the choice of bound verb stem is conditioned
by phonological considerations: phonologically light elements condition Σ³, while phonologically heavy
elements condition Σ². However, almost all person indexes are phonologically light. I would like to thank
Matthias Müller (p.c.) for reminding me of this.

8 This slot is also the one in which S arguments occur, but since they are not the focus of this article, I ignore
them here. Coptic argument indexing is nominative-accusative (S=A≠P) in terms of linear order.




Table 2: The structure of the Present tense verb

A/S    Lexical verb   (P)
ti-    -sops          (mmo-k)
1SG    entreat        (ACC-2M.SG)
‘I entreat you.’


Table 3: The structure of non-Present tense verbs

TAM/Polarity   A/S     Lexical verb   (P)
a-             -f-     -tamio-        -ou
PAST           3SG.M   create         3PL
‘He created them.’ (Shenoute, cited in Shisha-Halevy 1988: 34)
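
The two templates can be restated as slot-filling functions. The Python sketch below is a simplified illustration: the function names build_present and build_non_present are introduced here, only bound person indexes are handled in the A/S slot (a lexical noun phrase in that slot is left aside), and the caller is assumed to supply the appropriate stem allomorph.

```python
from typing import Optional


def build_present(a_or_s: str, verb: str, case_marked_p: Optional[str] = None) -> str:
    """Present tense: an A/S slot followed by the lexical verb.

    P cannot be indexed on the verb, so it can only be added as a
    separate, overtly case-marked phrase.
    """
    word = f"{a_or_s}-{verb}"
    return word if case_marked_p is None else f"{word} {case_marked_p}"


def build_non_present(tam: str, a_or_s: str, verb: str, p_index: Optional[str] = None) -> str:
    """Other templates: TAM/Polarity prefix, A/S index, lexical verb,
    and an optional P index suffixed in a fourth slot."""
    slots = [tam, a_or_s, verb]
    if p_index is not None:
        slots.append(p_index)
    return "-".join(slots)


print(build_present("ti", "sops", "mmo-k"))        # form shown in Table 2
print(build_non_present("a", "f", "tamio", "ou"))  # form shown in Table 3
```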


3.2.1 Indexing 


In monotransitive clauses, A and P can be indexed on the verb. Argument indexing is not 
obligatory. A given monotransitive verb can occur with an A index (8), a P index (9), both 
(10), or neither (11). Bound A indexes are prefixed to the lexical verb (or an auxiliary verb), 
and if an overt TAM/Polarity prefix is present, the latter precedes the person index. In 
order to simplify the presentation, the following examples are taken from the Past tense, 
whose basic structure is presented in Table 3 above:


(8) Coptic (Besa 46:26)
a-u-sōtp          n-ne-u-hiooue
PST-3PLA-choose   ACC-POSS.PL-3PL-ways
‘They have chosen their ways.’


(9) Coptic (Shenoute, cited in Shisha-Halevy 1988: 34)
a-p-čoeis           -tsto-ou       ebol
PST-DEF.M.SG-lord   -reject-3PLP   out
‘The Lord rejected them.’


(10) Coptic (Besa 45:32)
a-u-tamo-n
PST-3PLA-inform-1PLP
‘They informed us.’


(11) Coptic (Shenoute, cited in Shisha-Halevy 1988: 35)
a-n-daimonion      -soun-p-čoeis
PST-DEF.PL-demon   -know-DEF.M.SG-lord
‘The demons knew the Lord.’

Coptic has DOI, since person indexes can either be suffixed to the verb (12) or
case-marked (13), even for one and the same verb in the same verbal construction, e.g. the
Past tense:


(12) Coptic (Matthew 13:48, cited in Layton 2004: 132)
n-et-hoou=de               a-f-noč-ou           ebol
DEF.PL-REL-bad\STAT=PTCL   PST-3SG.M-cast-3PL   out
‘The bad ones, he cast them out.’


(13) Coptic (Luke 4:35, cited in Layton 2004: 132)
a-f-nouče=de          mmo-f
PST-3SG.M-cast=PTCL   ACC-3SG.M
‘And he threw him down.’


At present, there is no account of Coptic DOI, so I will not speculate on the functions
associated with it. What is important to establish in the present context is that P-indexes
are suffixed to the lexical verb, and cannot occur elsewhere within the verbal bound
group.


3.2.2 Case marking 


Coptic has a cross-linguistically unusual case-marking system: both the Nominative (14) 
and the Accusative (15) are overtly marked by prefixed case markers, but neither of these 
is the citation form. The citation form is the bare noun form, which is simply a nom- 
inal stem without case markers or other inflectional material, such as (in)definiteness 
or number-gender markers. Such case-marking systems have been called ‘marked A/S
vs. marked P’ by Creissels (2009) (see also Grossman 2015). Moreover, noun phrases are
overtly case-marked only if they are postverbal; if they are preverbal or incorporated 
into the verb, they are not case-marked. 


(14) Coptic (Luke 1:12)
a-f-štortr=de                nci-zakharias
PST-3SG.M-be.troubled=PTCL   NOM-Zacharias
‘But Zacharias was troubled.’


(15) Coptic (Luke 1:36)
a-s-ō                n-ou-šēre
PST-3SG.F-conceive   ACC-INDEF-son
‘She conceived a son.’


Examples (16) and (17) show the main constructions involved in Coptic DOM: lexical 
P must be either incorporated or overtly case-marked. 

(16) Coptic (Mark 4:36)
a-f-ka-p-mēēše
PST-3SG.M-leave-DEF.M.SG-multitude
‘He left the multitude.’


(17) Coptic (Matthew 13:36)
a-f-kō            m-p-mēēše
PST-3SG.M-leave   ACC-DEF.M.SG-multitude
‘He left the multitude.’


The conditions regulating Coptic DOM are complex, and involve both an aspectual 
split and discourse conditions that are still poorly understood and may vary from dialect 
to dialect and even from corpus to corpus (Engsheden 2008). However, there are some 
broad regularities. 

First of all, in the Present tense and in verbal constructions built on the Present tense 
(e.g. the Imperfect), DOM is strictly regulated by what is traditionally seen as definiteness,
but which could also be seen as a matter of referentiality: bare noun stems, which
tend to have non-referential semantics, are obligatorily incorporated into the verb; in 
(18), for example, the noun stem daimonion ‘demon(s)’ is non-referential. On the other 
hand, referential noun phrases of any sort are obligatorily case marked, as in (19), in 
which daimonion is referential and bears an indefiniteness prefix. 


(18) Coptic (Luke 11:15, cited in Layton 2004: 132)
e-f-neč-daimonion         ebol   hn-beelzeboul
BG.PRS-3SG.M-cast-demon   out    in-Beelzebul
‘He casts out demons by means of Beelzebul.’


(19) Coptic (Luke 11:14, cited in Layton 2004: 132)
ne-f-nouče=de          ebol   n-ou-daimonion
IMPF-3SG.M-cast=PTCL   out    ACC-INDEF.SG-demon
‘He cast out a demon.’


This extends to bound person markers as well: since person markers are referential by 
nature, they cannot be indexed on the verb and must receive overt case marking, as in 
(20). 


(20) Coptic (Acts 13:46, cited in Layton 2004: 236)
tetn-nouče     mmo-f       ebol
2PL.PRS-cast   ACC-3SG.M   out
‘You cast it out.’


Outside of the Present tense and related constructions (e.g. the Imperfect), it is still the 
case that bare noun stems are obligatorily incorporated into the verb, i.e. they cannot 
bear overt accusative case (21). On the other hand, noun phrases can either be case- 
marked (22) or incorporated (23), the conditioning factors governing the alternation still 
being unclear. 

(21) Coptic (1 Timothy 5:23)
mpr-se-moou 
PROH-drink-water 


‘Don’t drink water!’ 


(22) Coptic (Matthew 26:51) 
a-f-tókm n-te-f-séfe 
PST-3SG.M-draw ACC-POSS.F.SG-3SG.M-sword
‘He drew his sword.’


(23) Coptic (Mark 14:47) 
a-f-tekm-te-f-séfe 
PST-3SG.M-draw-POSS.F.SG-3SG.M-sword 


‘He drew his sword.’
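
The broad regularities described in this section can be restated as a small decision procedure. The following Python sketch is only an illustrative restatement of those generalizations, not an implemented analysis; the names `Coding` and `possible_p_codings`, as well as the two Boolean parameters, are assumptions introduced here for exposition.

```python
from enum import Enum
from typing import FrozenSet


class Coding(Enum):
    INCORPORATED = "incorporated into the verb"
    ACCUSATIVE = "overt accusative marking"


def possible_p_codings(present_based: bool, bare_stem: bool) -> FrozenSet[Coding]:
    """Return the codings of P permitted by the broad regularities above.

    present_based: the clause is in the Present tense or in a construction
        built on it (e.g. the Imperfect).
    bare_stem: P is a bare noun stem (typically non-referential).
    """
    if bare_stem:
        # Bare noun stems are obligatorily incorporated in all tenses.
        return frozenset({Coding.INCORPORATED})
    if present_based:
        # Referential noun phrases and person markers must be case-marked.
        return frozenset({Coding.ACCUSATIVE})
    # Elsewhere both codings occur; the conditioning factors remain unclear.
    return frozenset({Coding.INCORPORATED, Coding.ACCUSATIVE})
```

Under these assumptions, (18) corresponds to `possible_p_codings(True, True)`, (19) to `possible_p_codings(True, False)`, and the alternation in (22)-(23) to `possible_p_codings(False, False)`.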


3.2.3 Incorporation 


As discussed in §3.2.2 above, lexical A, S, or P can be incorporated into the verb. A/S
incorporation is unexpected from a cross-linguistic view,⁹ but it is unimportant for the
present discussion; I will focus here on P-incorporation, which is highly productive in
Coptic.

Nouns referring to body parts are often incorporated in Coptic, as in other languages 
(Mithun 1984; 1986; Mithun & Corbett 1999), and these body part terms often bear pos- 
sessor indexes, as in (24) and (25). 


(24) Coptic (Besa 10:24) 
a-f-ka-toot-f 
PST-3SG.M-put-hand-3SG.M
‘He ceased’ (lit. ‘he put his hand’).


(25) Coptic (Besa 3:30) 
n-tn-smn-toot-n 
SEQ-1PL-establish-hand-1PL 


‘And let us agree’ (lit. ‘let us establish our hand’).


The free forms (2!) of these verbs are, respectively, kó ‘put’ and smine ‘establish’. What
is noteworthy in these constructions is that the possessive suffixes can be analyzed as 
P-indexes, since incorporation of body parts produces new verbs whose meaning is not 
transparently predictable from the sum of the verbal and nominal roots. In other words, 
(24) and (25) above could be analyzed as follows in (26) and (27), with A and P being 
coreferential, and the construction as a whole being reflexive. 


⁹ An anonymous reviewer has drawn my attention to Zavala (2000), which argues that Olutec (Mixean)
allows the incorporation of A. 

(26) Coptic (Besa 10:24)
a-f-katoot-f
PST-3SG.M-cease-3SG.M
‘He ceased’ (lit. ‘he put his hand’).


(27) Coptic (Besa 3:30) 
n-tn-smntoot-n 
SEQ-1PL-agree-1PL 


‘And let us agree’ (lit. ‘let us establish our hand’).


It is important to note that this reanalysis is plausible, since these possessive suffixes 
are a relic of an earlier head-marking possessive construction, in which possessor in- 
dexes are suffixed directly to the possessum, as in (28) (Egedi 2010; Haspelmath 2015b). 


(28) Earlier Egyptian (Allen 2013: 102, 124) 
rn-k pr-k 
name-2M.SG house-2M.SG
‘your name’ ‘your house’


In Coptic, however, these suffixes are nearly obsolete, and occur only on a small list of
body parts and other inalienable nouns. The most frequent (and the only productive)
possessive construction in Coptic comprises a possessive prefix, which in turn comprises
a pronominal base that shows number distinctions (singular vs. plural) and, in the singular,
gender distinctions (masculine vs. feminine), to which a possessor index attaches, as in (29).


(29) Coptic (Matthew 7:22, John 10:3) 
pe-k-ran ne-u-ran 
POSS.M.SG-2M.SG-name  POSS.PL-3PL-name 


‘your name’ ‘their names’


Moreover, many of the nouns denoting body parts in the incorporation construction 
are themselves obsolete as independent lexical items, and they occur almost exclusively 
as parts of noun-verb compounds such as those in (24) and (25), or as parts of prepositions,
as in (30). As such, they can be treated as ‘obligatorily possessed nouns’ (Nichols
& Bickel 2005). A short list of forms used as bound roots is compared with the free
forms in Table 4 (for a full list, see Layton 2004: 102-104).


(30) Coptic (Matthew 5:25) 
etoot-f (< e-toot-f)
to-3SG.M to-hand-3SG.M
‘to him’ ‘to his hand’


Table 4: Bound forms and free forms of nouns denoting body parts

MEANING   BOUND FORM   FREE FORM
‘hand’    toot-        cic
‘foot’    rat-         oueréte
‘eye’     eiat-        bal
‘head’    čô-          ape

It can be argued that the erstwhile possessor indexes have been reanalyzed as P suffixes,
for the following reasons: (a) the possessor suffixes are not a productive strategy for
marking the possessor on nouns, (b) the noun roots to which they attach are not
identifiable as free forms with a lexical meaning, and (c) the meaning of the incorporated
constructions is not transparent. This is further corroborated by the fact that P-indexes
of underived verbs are also suffixed to the lexical verb, which plausibly would have
enhanced the likelihood of possessor indexes being reanalyzed as P-indexes.

These facts about Coptic will be used to explain the origin of infixed P-indexes that 
occur within the lexical verb. In the next section, it is shown that the Coptic possessor 
prefix developed, in part, from an earlier possessor suffix. 


4 From suffix to prefix in the coding of possessors 


The diachronic relationship between the two ways of indexing the possessor in a possessive
phrase, i.e. via possessor suffixes (28) or possessor prefixes (29), is well documented
in the history of Ancient Egyptian. The head-marking construction with a possessor index
suffixed to the noun denoting the possessum (28) is, historically speaking, the older
construction, attested from the very beginning of the textual record.

A competing construction, which emerged relatively early in the textual record, comprises
a demonstrative pronoun (p3y), to which the possessor index (e.g. -f) was suffixed.
One of the earliest examples documented is shown in (31).


(31) Old Egyptian (cited in Sojic forthcoming) 


p3y-f hrw 
DEM-3SG.M day
‘his day’


This newer construction rose in frequency over the course of Ancient Egyptian di- 
achrony, but remained in variation with the older construction until thousands of years 
after the new construction is first documented (Gardiner forthcoming; Sojic forthcoming; 
Winand forthcoming). For example, in the 14th century BCE, we find the two construc- 
tions as variants at the same time in the same type of text. The earlier construction is 
found in (32), the innovative one in (33). 

(32) Late Egyptian (cited in Sojic forthcoming)
msc-f 
army-3SG.M 
‘his army’ 

(33) Late Egyptian (cited in Sojic forthcoming) 


p3y-f msc
POSS.M.SG-3SG.M army
‘his army’

By the time of Coptic, the latest stage of the language, the new construction has be- 
come bound to the possessum, becoming in effect a prefixed possessor index (Grossman 
forthcoming), as in (34): 

(34) Coptic (Matthew 1:23) 


pe-f-ran 
POSS.M.SG-3SG.M-name 


‘his name’ 


In brief, the diachronic change observed here can be represented schematically as in 
Table 5. 


Table 5: The diachrony of possessor infixes in Ancient Egyptian-Coptic 


          Possessor index
Stage 1   suffix only (rn-k name-2M.SG ‘your name’)
Stage 2   suffix productive (rn-k), preposed possessor index (p3y-k rn) begins to emerge
Stage 3   suffix and preposed possessor index in variation (rn-k vs. p3y-k rn)
Stage 4   (a) preposed possessor index becomes prefixed to noun (pe-k-ran)
          (b) prefix productive, suffix limited to a small set of nouns


We now turn to the development of an interposed P-index from the prefixed possessor 
index in (34). 


5 From prefix to infix in the coding of P 


As mentioned above in §3, Coptic has a productive noun incorporation construction, in 
which nouns in P role are attached to a bound form of the verb. Unusually from a cross- 
linguistic point of view, not only bare noun roots but also referential noun phrases can 
be incorporated in tenses other than the present. 


For a full account of the diachrony of the two possessive constructions in the history of Egyptian, see 
Gardiner (forthcoming), Sojic (forthcoming) and Winand (forthcoming), as well as Haspelmath (2015b) 
and Kammerzell (2000), which are typologically-oriented. 

For one thing, incorporated nouns can bear overt (in)definiteness marking, as in (11)
above, repeated here as (35) for convenience. 


(35) Coptic (Shenoute, cited in Shisha-Halevy 1988: 35) 
a-n-daimonion -soun-p-Coeis
PST-DEF.PL-demon -know-DEF.M.SG-lord
‘The demons knew the Lord.’

Moreover, incorporated nouns can be quantified (36) or modified adjectivally (37).


(36) Coptic (Shenoute, cited in Shisha-Halevy 1988: 37) 
mp-f-ka-ce-hób 
PST.NEG-3SG.M-put-another-thing
‘He did not leave another thing.’


(37) Coptic (Mark 2:22, cited in Layton 2004: 132) 
mere-laau -nec-érp b-brre e-hót n-as 
AOR.NEG-anyone -throw-wine MOD-new to-wineskin MOD-old
‘No one puts new wine into old wineskins.’

Incorporated noun phrases can be referred to anaphorically, as in (38).


(38) Coptic (Besa 9:31) 
mp-ou-oues-pe-smou a-f-pót ebol mmo-ou 
PST.NEG-3PL-love-DEF.M.SG-blessing PST-3SG.M-flee out OBL-3PL
‘They did not love the blessing, and it fled away from them.’


Crucially, incorporated nouns can be marked as possessed in at least three ways. The 
first is when erstwhile possessive suffixes attach to incorporated body parts, as in (24)- 
(25) above. The second way is when the possessor is a lexical noun phrase, which follows 
the incorporated noun and is marked as dependent by the Genitive prefix n-, as in (39) 
and (40). 


(39) Coptic (Besa 2:23) 


mar-n-r-p-meeue n-ne-nt-a-pe-n-eiót -čoo-u
JUSS-1PL-do-DEF.M.SG-thought¹¹ GEN-DEF.PL-REL-PST-POSS.M.SG-1PL-father -say-3PL
‘Let us remember those things that our father has said’ (lit. ‘Let us do the
thought of the things that our father has said’).


¹¹ The lexical noun meeue means ‘thought’, but the derived verb rpmeeue (lit. ‘do the thought’) means
‘remember’.

(40) Coptic (Besa 4:19)
e-r-p-ôbš n-n-entolê m-p-noute
INF-do-DEF.M.SG-forget GEN-DEF.PL-commandment GEN-DEF.M.SG-god
‘to forget the commandments of God’ (lit. ‘to do the forgetting of the
commandments of God’)


The third way is by means of the possessor prefix described in §4. In (41), the possessor
prefix pes- is part of the incorporated nominal.


(41) Coptic (Hebrews 13:2, cited in Layton 2004: 142) 
t-mntmaismmo mpr-r-pe-s-ôbš
DEF.F.SG-hospitality PROH-do-POSS.M.SG-3SG.F-forget
‘As for hospitality, do not forget it.’


The question is whether the verbs rpmeeue ‘remember’, rpôbš ‘forget’, and cmpšine
‘visit’ are synchronically analyzable as compositionally derived from a verb root and a
possessive noun phrase, or whether they are better treated as distinct lexical items with
no internal structure.

A point in favor of the former analysis is the fact that their derivational history is
clear, and their component parts all exist as independent lexical items in Coptic. On the
other hand, in favor of the latter is the fact that they have a distinct lexical meaning that
is unpredictable from the original components. For example, the bound verb form (2?) r-
‘do’¹² is commonly used to derive verbs from nouns, e.g. nobe ‘sin’ vs. r-nobe ‘to sin’. In
the case of rpmeeue, it does not derive a verb from meeue, which means ‘think, thought,
opinion’, but rather from pmeeue, which means ‘remembrance’, and rpmeeue means ‘to
remember, to be mindful of’.

Similarly, cmpšine is the result of the compounding of the verb cn- (free form cine)
‘find’ and pšine ‘visit’, itself derived from šine, which means ‘to ask, to inquire, to visit’ or
‘inquiry, news, report’. In this case, the derived noun lexicalizes only a narrow part of the
polysemy network of the underived noun. If šine means ‘to ask, to inquire, to visit’, pšine
lexicalizes only ‘visit’, and the derived verb cmpšine lexicalizes this meaning. I take this as
evidence that the meaning of the verbs derived via incorporation is not fully predictable
from their components, and as such, that verbs like rpmeeue or cmpšine are synchronically
distinct form-meaning pairings. This is typical of some types of incorporation (Mithun
& Corbett 1999).

Another argument in favor of analyzing these derived verbs as synchronically simple
verbs is that the genitive prefix that marks lexical noun possessors of the incorporated
noun phrase is homonymous with the accusative case prefix. Compare the genitive prefix
in (42) with the accusative prefix in (43). In (42), the original structure of the construction
can be glossed as ‘let us do the thought of those things that our father has said’, with
the incorporation of p-meeue ‘the-thought’. The genitive prefix n- marks the determined
relative clause (‘those things that our father has said’). In (43), the accusative prefix n-
simply marks the P argument.


¹² The corresponding free form (21) is eire.

(42) Coptic (Besa 2:23)
mar-n-r-p-meeue n-ne-nt-a-pe-n-eiót -čoo-u
JUSS-1PL-do-DEF.M.SG-thought GEN-DEF.PL-REL-PST-POSS.M.SG-1PL-father -say-3PL
‘Let us remember those things that our father has said’ (lit. ‘Let us do the
remembrance of those things that our father has said’).


(43) Coptic (Besa 46:26) 
a-u-sótp n-ne-u-hiooue 
PST-3PLA-choose ACC-POSS.PL-3PL-ways 


‘They have chosen their ways.’


These prefixes are diachronically distinct (Winand 2015), but in this particular envi- 
ronment, they are homonymous. This homonymy would plausibly lead to the reanalysis 
of the genitive prefix in this context as the accusative prefix, i.e.: 


(44) Coptic (Besa 2:23) 
mar-n-rpmeeue n-ne-nt-a-pe-n-eiót -čoo-u
JUSS-1PL-remember ACC-DEF.PL-REL-PST-POSS.M.SG-1PL-father -say-3PL
‘Let us remember those things that our father has said.’


If the verbs discussed here are analyzed as distinct lexical items, the person indexes in
(45)-(47) are interpositions, occurring synchronically at an arbitrary position. Diachronically,
however, they are simply in the position of earlier possessor indexes, which were
prefixed to incorporated possessed nouns. For example, in (45)-(47), the P interposition
is in the position of the earlier possessor index, which occurred between the earlier lexical
verb and the possessed noun.


(45) Coptic (2 Timothy 2:14) 
mar-ou-rpe<u>meeue 
JUSS-3PLA-remember<3PLP>remember 


‘Let themᵢ remember themⱼ.’

(46) Coptic (Besa 4:17) 
n-se-tm-rpe<u>ôbš
SEQ-3PLA-NEG-forget<3PLP>forget
‘that theyᵢ not forget themⱼ’


(47) Coptic (Matthew 25:36) 
a-tetn-cmp<a>Sine 
PST-2PLA-visit<1SGP>visit
‘You visited me.’

The pathway of change sketched in this article shows one way that an affix can move
without moving. The constellation of changes involved is complex, and involves the inter- 
action of multiple grammatical systems. To summarize, I have argued that the following 
changes led to a suffix becoming a prefix, and this prefix becoming an infix, or more 
properly, an interposition: 


1. First, an old head-marking possessive construction involving suffixed possessor 
indexes is superseded by a newer construction in which the possessor index is 
suffixed to a demonstrative, the entire construction grammaticalizing into a pos- 
sessive prefix with the possessor index prefixed to the possessum noun. 


2. Later on, noun phrases comprising the newer possessor prefix undergo incorpora- 
tion, with the resulting derived verb being a synchronically distinct form-function 
pairing whose meaning is not fully predictable from its component parts. 


3. Once incorporated, the possessor index is reanalyzed as a P-index, which is infixed,
or more properly, interposed, within the lexical verb. The process of reanalysis was
facilitated by the homonymy of the prefix n-, which marks both lexical possessors
(GEN) and lexical P arguments (ACC). As such, the postverbal possessor of the
incorporated noun was reanalyzed as a postverbal P.


This complex series of changes is represented schematically, and with much flattening 
out of actual diachrony, in Figure 1: 


Change                                                     Construction

possessor index suffixed to noun                           X-f ‘his X’
    ↓
development of new preposed possessive article from        p3y-f X ‘his X’
DEMONSTRATIVE+POSSESSOR SUFFIX
    ↓
possessive article becomes bound to noun, possessor        pef-X ‘his X’
index becomes prefix on noun
    ↓
possessed nouns incorporated into verbs                    V-pef-X ‘to V his X’
    ↓
loss of compositional semantics, reanalysis of genitive
as accusative
    ↓
reanalysis of verb as bipartite stem, reanalysis of        V1-f-V2 ‘to V him’
possessor prefix as interposed P index

Figure 1: Schematic representation of the change from suffix to prefix to interposition

6 A broader view: P-drift?


Taking a broader view of the complex change here, it might be possible to speak of P-drift
or direct object drift, in which certain non-P clause participants, given the right
circumstances, are preferentially reanalyzed as P. This is, in a sense, inverting (but also
broadening) the phenomena associated with ‘HAVE-drift’ (Comrie 1981; Stassen 2009),
in which intransitive predicative possession constructions gradually acquire properties
associated with transitivity. Such a process also occurred in Ancient Egyptian-Coptic,
in which existential-locative constructions gradually acquired DOM properties, i.e. the
alternation between possessum incorporation and overt accusative marking.

In the first stage, the possessum noun occurred between a clause-initial existential 
marker and a clause-final locative preposition, as in (48). 


(48) Late Egyptian (Late Ramesside Letters 19:15) 
wn hmt im m-di-k 
EXIST copper there LOC-hand-2M.SG
‘You have copper’ (lit. ‘there is copper in your hand’).


The existential marker wn and the locative preposition m-di- (‘in-hand_of’) under- 
went univerbation, with the loss of the locative preposition, which left the possessum 
after the bound person marker, resulting in structures like that in (49). 


(49) Late Egyptian (P. Moscow 120, 1,58) 
in wn-di-f ist hän
INT EXIST-in_hand=3SG.M crew Syrian
‘Does he have a Syrian crew?’


By the time of Coptic, the possessor is bound to the possessive predicate ounta-, and 
the lexical possessum can be marked by the accusative prefix n-, as in (50). 


(50) Coptic (John 10:16, cited in Layton 2004: 306) 
ounta-i-on mmau  n-hen-ke-esoou 
POSS-1SG=also there ACC-INDEF.PL-other-sheep
‘I have other sheep too.’

The possessum can also be incorporated, as in (51).


(51) Coptic (Matthew 8:20, cited in Layton 2004: 308) 
n-basor ounta-u-ne-u-béb 
DEF.PL-fox  POSS-3PL-POSS.PL-3PL-hole 


‘As for foxes, they have their holes.’


These constructions also acquired the DSM properties of transitive clauses in
Coptic (Grossman 2015), with lexical possessor incorporation (52) alternating with overt
nominative marking on the lexical possessor (53). In (52), the noun phrase referring to
the possessor (‘the servant’) is incorporated into the possessive predicate ounte-, while
in (53), the lexical possessor (‘the son’) occurs after the possessive predicate, which bears
a person marker (-f) that indexes the possessor.


(52) Coptic (Luke 17:9, cited in Layton 2004: 307) 
mé  ounte-p-hmhal hmot 
Q POSS-DEF.M.SG-servant thanks
‘Does the servant have any thanks?’


(53) Coptic (Mark 2:10, cited in Layton 2004: 308) 
ount-f-eksousia mmau  nci-p-Sére m-p-róme 
POSS-3SG.M-authority there NOM-DEF.M.SG-son GEN-DEF.M.SG-man
e-ka-nobe ebol
INF-put-sin out
‘The son of man has authority to forgive sins.’


Compare with A/S-incorporation (54) vs. nominative case marking (55) in monotran- 
sitive verbal clauses: 


(54) Coptic (Mark 15:2) 
a-pilatos -énou-f
PST-Pilate -ask-3SG.M
‘Pilate asked him.’


(55) Coptic (Mark 13:3) 


a-f-énou-f nci-petros 
PST-3SG.M-ask-3SG.M NOM-Peter
‘Peter asked him.’


In other words, in terms of indexing and case-marking, Coptic possessors behave like 
A and possessums behave like P. 

While the examples of Ancient Egyptian-Coptic ‘HAVE-drift’ sketched above provide
additional data for an already established pathway, the present study shows yet another 
pathway in which possessors are reinterpreted as A and possessums as P, namely, via 
the incorporation of body parts with possessor indexes in the same position as P in- 
dexes in underived verbs. This in turn provides evidence that transitivization is not 
a single pathway, especially if we take into account pathways like those described in 
Gildea (1998) for Cariban languages, e.g. POSSESSOR > NOMINATIVE, and POSSESSOR > 
ERGATIVE. These changes, interestingly, involve nominalizations being reinterpreted as 
main clauses, which is strikingly different from what we find in Coptic. 

However, since synchronic polysemies of case-markers as well as diachronic evidence
indicate that other pathways are possible, e.g. POSSESSOR > ACCUSATIVE (also in Gildea 1998),
the motivations and mechanisms of P-drift still remain in need of clarification. A possible
explanation might be found in Seržant (2013: 303), which explains the development of
canonical subject coding, e.g. nominative case marking, by appealing to semantics, arguing
that “the consistent endowment of a constituent with some functional properties of
a prototypical subject is the main catalyst for the (re)assignment of subject coding and
behavioral properties to that constituent; it is an adjustment of grammatical properties
to function”. Seržant formulates the diachronic universal as follows (2013: 303):


Consistent functional-semantic overlap of an oblique case-marked constituent with 
the prototypical subject may trigger the (re)assignment of the subject coding and 
behavioral properties to that constituent if there are no other constituents in the 
construction that would show even greater overlap. 


Since possessors often have the semantic and discourse properties of prototypical sub- 
jects (e.g. animacy, topicality), and possessums often have the semantic and discourse 
properties of prototypical objects (e.g. inanimacy, focality), the way is paved for the mor- 
phosyntactic coding properties of the possession construction to be ‘adjusted’ to fit its 
semantics. In the case of Coptic, these coding properties mainly involve the participation 
in DSM (the alternation between nominative marking and incorporation) and DOM (the 
alternation between accusative marking and incorporation). 


7 Conclusions 


The phenomenon of bipartite stems with person interpositions seems to be quite rare, 
cross-linguistically. Bipartite stems with person interpositions have been documented 
only in several language families spoken in a fairly small number of areas (Bickel & 
Nichols 2007; Hildebrandt 2005). The diachronic pathways through which bipartite stems 
develop are assumed to include relics of derivational morphology or compounding, or 
infixation that has become morphologized (Bickel & Nichols 2007: 199, DeLancey 1996), 
the movement and entrapment of clitics (Nichols 2003), or the copying of affixes from 
another construction type, e.g. head class markers from nouns to verbs (Nichols 2003). 
Ancient Egyptian-Coptic presents us with a particular pathway of development that is 
close to the reanalysis of compounding, since compounding and incorporation are re- 
lated morphological processes, and in some views, incorporation is a particular type of 
compounding (Mithun & Corbett 1999). 

However, actual diachronic studies of the development of bipartite stems and interpositions
in documented historical corpora are few and far between; previous research
on bipartite stems has leaned heavily on reconstruction. The present case study shows
how complex the development of bipartite stems and interpositions can be, since it is
the specific interaction of Differential Object Marking (the alternation between overt
accusative case marking and incorporation of possessed nouns) and Differential Object
Indexing (the complementary distribution between object marking and object indexing)
that led to the reanalysis of possessor indexes as P indexes, and more specifically, to the
reanalysis of possessor prefixes as P indexes interposed within a simplex verb stem.

Abbreviations

1      first person
2      second person
3      third person
A      agent-like argument of canonical transitive verb
ACC    accusative
AOR    aorist (habitual verb form)
ART    article
BG     backgrounder, prefix that marks the verb as topical/an adjunct as focal
CNVB   converb
DAT    dative
DEF    definite
DEM    demonstrative
DET    determiner
EXIST  existential
F      feminine
FOC    focus
GEN    genitive
INDEF  indefinite
INF    infinitive
INT    intransitive
IMPF   imperfect
JUSS   jussive
LOC    locative
M      masculine
MOD    modifier marker
NEG    negation, negative
NOM    nominative
P      patient-like argument of canonical transitive verb
PL     plural
POSS   possessive
PRS    present
PROH   prohibitive
PST    past
PTCL   particle
PTCP   participle
Q      question particle/marker
REL    relative
SG     singular
SEQ    sequential verb form
STAT   stative verb form

Chapter 6


Verbal semantics and differential object 
marking in Lycopolitan Coptic 


Åke Engsheden


Stockholm University, Department of Archaeology and Classical Studies 


This paper seeks to clarify the role of affectedness for the marking of direct objects through 
an analysis of a corpus of Lycopolitan Coptic texts (4th to 5th centuries AD). Whereas previ- 
ous research has shown the importance of definiteness for the use of the direct object marker 
n with the so-called imperfective tenses (present and imperfect), it has proven more diffi- 
cult to establish why it alternates in the non-imperfective with a zero marker. An attempt 
is made here to correlate the two different object constructions to Tsunoda's verb-type hier- 
archy, which was conceived to capture the degree of affectedness. It appears that the more 
affected a direct object is, the more likely it is to receive the direct object marker; whenever 
the object is little affected or unaffected, the zero-marked construction is preferred. 


1 Introduction 


Most works that have tried to explain Differential Object Marking (DOM) have focused 
on the semantic and information-structural properties of the direct object (animacy, defi- 
niteness, specificity, or topicality). There are a few languages for which the identification 
of the triggering factor behind DOM may be quite straightforward, such as definiteness 
in Modern Hebrew (Danon 2001) or specificity in Turkish (Enç 1991), but more commonly
a multidimensional DOM system results not from a single factor, but from several
interacting factors.

One language with a multidimensional DOM is Coptic (Afro-Asiatic, Egyptian branch,
now extinct).¹ Coptic DOM has received far less attention than one might expect, given that
Coptic has a long tradition in academic studies. Indeed, it is still unclear what exact factors
are operative and how they relate to each other. The present study aims to show how
the verb type, which is defined through the degree of affectedness found with the object, 


¹ Egyptian is divided into the following language stages: Old Egyptian (c. 3100-2000 BC), Middle Egyptian
(2000-1350 BC), Late Egyptian (1350-700 BC), Demotic (700 BC-AD 452) and Coptic (AD 200-1400). For 
a useful grammatical overview see Haspelmath (2015). For a detailed diachronic description aimed at a 
linguistic readership, see Loprieno (1995). 


Åke Engsheden. Verbal semantics and differential object marking in Lycopolitan
Coptic. In Ilja A. Seržant & Alena Witzlack-Makarevich (eds.), Diachrony
of differential argument marking, 137-162. Berlin: Language Science Press.
influences whether the object is marked as such or receives no marking. This will be done
through a corpus-based study of Lycopolitan, an early literary variety (traditionally and 
henceforth “dialect”) of Coptic that was prevalent in the 4th and 5th centuries AD. The 
analysis indicates that the overtly marked construction is favoured by the presence of 
a highly affected object, whereas the zero-marked construction is favoured whenever 
the object is little affected or unaffected by the verbal action. Beside the value of such 
a study for our understanding of argument marking in Coptic itself, a wider knowledge 
of Coptic data should be of interest to linguistics because Coptic presents a system that 
is markedly different from better-explored patterns of DOM. 

This paper is structured as follows. §2 provides a synthesis of Coptic object marking,
including a summary of previous research. §3 contains a short description of the corpus
of Lycopolitan texts and presents some background data. §4 presents an overview of the
role of verbal semantics in research into DOM, and introduces some theoretical work
on how verbs can be arranged on semantic grounds in broad verb-type categories. In §5,
statistics are provided for the realisation of the object in Lycopolitan Coptic in relation
to the verb types. The analysis suggests that the distribution of two alternating object
constructions depends on the degree of affectedness of the object. In §6, the relationship
between affectedness and other factors is discussed. Finally, §7 contains a summary and
preliminary conclusions.


2 Argument realisation in Coptic 


Coptic DOM is of the asymmetric type (de Hoop & Malchukov 2008; Iemmolo 2013), in
which the direct object is either overtly marked with a preposition or zero-marked. The
marker before NPs is a preposition, n (before labials m), the origin of which is ultimately
locative. A longer form, mma, is used preceding the clitic person markers.² Both are
subsumed in the following under the term n-marking. Note that one often has a double
marking of the transitive construction, because many verbs have separate allomorphs
depending on which object construction is used. The verbal allomorphs are, by and large,
distinguished by different vowels because of the shape of the syllable and stress rules.
The n-marked object appears only after the regular stem of the verb with a full vowel
carrying stress (e.g. nouje ‘to throw’, see Table 1). A zero-marked NP, on the other hand,
can appear both after the regular stem and after an allomorph of the verb with a reduced
vowel.³ For some morphological classes of the verb only one allomorph is used before
zero-marked NPs and personal pronouns. Thus, the verb ‘to throw’ can assume the form
naj before NPs and personal clitics (exemplified in Table 1 through the 3 m.sg. pronoun f).


² In Sahidic, the supra-regional dialect of the south, the equivalent forms are n and mmo. Both forms derive
from the preposition m, used in older Egyptian for location in something (‘essive’) as well as for motion
away from something (‘elative’), from whence derives the partitive meaning that seems to have given rise
to object marking (Winand 2015).

³ Only the latter is possible in many other dialects. I have deliberately not distinguished these two cases in
the counts in the tables, because I wish to avoid a digression on the morphology of the verb classes. 

Table 1: Verb allomorphs and object marking in Lycopolitan

n-marked O      nouje n NP / nouje mma-f    —
zero-marked O   nouj NP / —                 naj-NP / naj-f


There are also verbs that have different allomorphs with zero-marked objects, depending
on what follows them. For instance, the verb eire ‘to make’ or ‘to do’ assumes the
form r in front of NPs, while it becomes ee (alternatively eit) in front of personal clitics.

The rules governing the selection between the n-marked form and the zero-marked
form are far from clear. A few important observations that have been made in the past
are summarised here and in the following subsections (§2.1-§2.2).

Case marking occurs only in the post-verbal position (this also applies to subject marking,
see Grossman 2015). When an object is fronted, a familiar strategy for topicalisation,
it is not case-marked itself but is resumed postverbally through the appropriate person
marker. Both n-marked and zero-marked resumptive pronouns occur (1a-1b):


(1) a. t-mnt-lilou       a-i-t'bio       mma-s
       DEF.F-ABST-youth  PST-1SG-subdue  ACC-3F.SG
       ‘Youth I subdued’ (Psalm-book 88, 27)
    b. eis   p-kah       m-p-keke            a-n-Sab-f
       PTCL  DEF.M-land  GEN-DEF.M-darkness  PST-1PL-devastate-3M.SG
       ‘Look, the Land of Darkness we devastated’ (Psalm-book 201, 23)


Object marking with n/mma is also found in some non-differential contexts. For ex- 
ample, n-marking must be used whenever the direct object is separated from the verb 
by any element. In the following example (2), the object (tef-hikón) is preceded by n due 
to the placement of the verbal particle abal. Zero-marked objects are only allowed when 
the object directly follows the verb with no intervening element. 


(2) Sa-p-séu          etere  p-iót         na-cólp     abal  n-tef-hikón
    until-DEF.M-time  REL    DEF.M-father  FUT-reveal  PTCL  ACC-POSS.3M.SG-image
    n-t-pe         mma-u      / (*cólp abal tef-hikón)
    ADV-DEF.F-sky  PREP-them  / *reveal PTCL POSS.3M.SG-image
    ‘until the time when the Father will reveal his image above them’ (Kephalaia 103, 6)


Furthermore, the majority of verbs borrowed from Greek have their objects introduced 
with n/mma (3). This is determined by the valency of the verb and is not a differential 
environment and, consequently, Greek loan-verbs are not treated in this paper. 


(3) a-s-panhoplize  mma-f 
PST-3F.SG-arm  ACC-3M.SG 
‘She armed him’ (Kephalaia 39, 4)
It has long been recognised that n-marking is used with an NP only when the latter is
determined by any of the articles (definite or indefinite), the possessive determiner, or 
a demonstrative. The n-marking is not used with a bare noun, which signals a generic 
and indefinite sense. It would thus seem as if Coptic DOM conforms to the definiteness 
hierarchy: personal pronoun > proper noun > definite NP > indefinite specific NP > non- 
specific NP (e.g. Aissen 2003: 437). The cut-off point along this scale differs between the 
two main TAM categories (imperfective vs. non-imperfective), but the lowest ranked
category (non-specific NPs) is excluded in both. As definiteness is an all-pervasive fea- 
ture (irrespective of TAM), it can be said to be the single most important factor for the 
selection of n-marking in Coptic (cf. Sinnemäki 2014: 309). 


2.1 Imperfective tenses 


There is a TAM-based split in the distribution of object marking, to the effect that the 
n-marked form is obligatory with the imperfective tenses (present and imperfect) when 
the object is grammatically definite, and optional, it seems, with all other tenses (see
§2.2).⁴ This means that the n-marked form was used with personal pronouns (4), demonstratives
(5), and NPs preceded either by the definite article (6) or the indefinite article (7)
whenever the verb is in the present or the imperfect: 


(4) etbe     peei   pa-eiót          maeie  mma-i    / (*merit-Ø)
    because  DEM.M  POSS.1SG-father  love   ACC-1SG  / *love-1SG
    ‘Because of this my father loves me’ (John 10: 17)

(5) auô  tes-ke-meeu             ne-s-jou         n-neei      / (*je-neei)
    and  POSS.3F.SG-also-mother  IMPF-3F.SG-say   ACC-this.N  / *say-this.N
    ‘And also her mother was saying this’ (Acts of Paul 11, 25)

(6) anak  ti-saune  m-pa-eiót            / (*souón-pa-eiót)
    1SG   1SG-know  ACC-POSS.1SG-father  / *know-POSS.1SG-father
    ‘I know my father’ (John 10: 15)

(7) p-et-Sól        n-ou-ónh       abal  / (*Sal-ou-ónh)
    DEF.M-REL-shed  ACC-INDF-life  out   / *shed-INDF-life
    ‘He who sheds a life’ (Psalm-book 39, 26)


The rule of obligatory marking also holds true for the possessive determiner (8) that 
is formed from the definite article marking the gender and number of the possessee, to 
which the appropriate personal marker for the possessor is affixed. 


⁴ The rules governing object marking with the imperfective tenses were first described by Ludwig Stern
(1880) before being elaborated by Pëtr Viktorovič Ernštedt (Jernstedt 1927), for which reason they are known
as the Stern-Jernstedt rule in Coptological jargon.

(8) hama=nde              an    ne-s-maeie       n-tes-sére
    at.the.same.time=but  also  IMPF-3F.SG-love  ACC-POSS.3F.SG-daughter
    phalkónilla  / (*meri-tes-sére)
    Falconilla   / *love-POSS.3F.SG-daughter
    ‘at the same time she also loved her daughter Falconilla’ (Acts of Paul 22, 17)


Grammatically definite objects are marked irrespective of specificity. In general, both 
specific and non-specific NPs are n-marked. Exceptions to this occur whenever a light 
verb forms a verbal expression together with its syntactical object, as in the following 
example (9), with r-p-meeue ‘to remember’ (lit. ‘to do the remembrance’). N-marking is 
attested with light verbs in other dialects and texts (Layton 2000: 133). 


(9) ntaf   n-Sarp     p-et-hn-plérouma      p-et-ah-tóbh        mmaf
    3M.SG  ADV-first  DEF.M-REL-in-Pleroma  DEF.M-REL-PST-pray  ACC-3M.SG
    auó  e-f-r-p-meeue
    and  CIRC-3M.SG-do-DEF.M-memory
    ‘The one who is in the Pleroma was what he first prayed to and remembered’
    (Tripartite tractate 81, 30-32)


There is one lexical exception to this pattern, where the definite article has no influence 
on object marking with the imperfective tenses. The verb ouós ‘to want’ is always used 
with a zero-marked definite object, as seen in (10).⁵


(10) e-u-jpo              m-p-et-<ou>-ouas-f      / (*ouós mma-f)
     CIRC-3PL-give.birth  ACC-DEF-3PL-wish-3M.SG  / *wish ACC-3M.SG
     ‘they begetting what they wish’ (Tripartite tractate 64, 15)


Language history has been invoked to explain this exception. It has been suggested
that the distinction between two different frames (wh3 n O ‘to look for’ as opposed
to wh3 O ‘to wish’) was made at an earlier stage of the language (Demotic) and is
preserved here (Depuydt 1993). In §5 I will offer an alternative functional explanation,
which is based on an observation of Coptic data.

When no determiner is present, the object is zero-marked (11). In such a case the noun 
is non-referential and non-specific, and does not reappear in the discourse. Zero-marking 
usually applies to indefinite pronouns as objects, but there are counter-examples, such 
as the one found in the first part of the sentence quoted in (12).⁶


(11) ti-Sp-hmat         n-toot-k              / (*Sóp n-hmat)
     1SG-receive-grace  from-hand-POSS.2M.SG  / *receive ACC-grace
     ‘I receive grace from your hand’ [i.e. ‘I thank you’] (John 11: 41)


⁵ In accordance with the Leiden Conventions for Papyrology, I use square brackets for restorations, and
angled brackets for text omitted by the ancient scribe.

⁶ One may try to attribute a specific reading to the object in (12), which would be awkward, or else one can
explain the use of n-marking with saune ‘to know’ in morphological terms (see §5.3).

(12) f-saune     n-laue         en   / (*f-souón-laue en)          oude
     3M.SG-know  ACC-something  NEG  / *3M.SG-know-something NEG   nor
     f-r-laue            n-hof      en   an
     3M.SG-do-something  GEN-thing  NEG  also
     ‘It [sc. the fruit] knows nothing, nor does it do anything’ (Gospel of Truth 28, 9-10)


I have not found in my corpus of Lycopolitan Coptic any example of a proper noun as
an object with the imperfective tenses, but data from other dialects show that n-marking 
must be used in such cases. As is apparent from the above, semantic and morphological 
definiteness triggers the marking of the object. 

Note that object marking is an innovation in the evolution of the Egyptian language. 
Afroasiatic case has not left any indisputable traces. Differential marking with the prepo- 
sition n started to appear around 1000 BC, first in the imperfective as a marker of the 
unbounded aspect (Winand 2015; cf. Engsheden 2006: 218-219), but it spread to the non- 
imperfective tenses in the first millennium AD. 


2.2 Non-imperfective tenses 


The rationale behind the alternating use of n/Ø with non-imperfective tenses is less clear.
Coptic is rich in various TAM forms that are often labelled in an idiosyncratic way. What 
I call non-imperfective TAM forms covers every verbal form other than the present and 
the imperfect.⁷

The non-imperfective is a negatively-defined term that is used here as a label only: 
it encompasses the perfective as well as aspectually neutral forms. I include the future 
among the non-imperfective tenses. This differs from the tradition in Coptic linguistics
of including the future, which is characterised by the infix na- (traditionally known as the
‘first future’), along with the present and the imperfect, among the imperfective tenses.⁸

With the non-imperfective tenses (including the future), n-marking appears option- 
ally with personal pronouns and NPs that have any of the three determiners: the indefi- 
nite, definite, or possessive articles. The common view of non-imperfective tenses among 
Coptologists is that “non-zero objects fluctuate (by speaker's stylistic choice)” (Layton
2000: 132). One leading Coptologist has even stated that n-marking and zero-marking
of the object “are generally understood to be functionally equivalent” (Emmel 2006: 41).
At first glance, this appears to be true, because both constructions are found in more
or less identical contexts, as in (13a)-(13b), where both phrases have the same verb in a
terminative subordinate clause:


⁷ The group comprises past, future, optative, jussive, aorist, conditional, imperative, and a verb form called
conjunctive that is used for subsequent action, etc.

⁸ There are historical reasons for dividing Coptic TAM forms into two groups: the so-called
adverbial/bipartite/durative pattern (i.e. my imperfective) vs. the verbal/tripartite/non-durative pattern
(my non-imperfective). As the future tense form mostly appears in non-imperfective contexts, and shares its
argument realisation strategies with non-imperfectives, I believe that the Coptic future is better classified
among the non-imperfective tenses (following Quevedo Álvarez 2001). For this reason, counts for the
future are included among the non-imperfective tenses in this article.

(13) a. Sant-i-jak-pa-agón
        until-1SG-complete-POSS.1SG-struggle
        ‘until I complete my struggle’ (Psalm-book 93, 9)
     b. Sant-i-jók          m-pa-agón
        until-1SG-complete  ACC-POSS.1SG-struggle
        ‘until I complete my struggle’ (Psalm-book 149, 19)


It should be noted that employing DOM with non-imperfective tenses is a relatively 
late phenomenon. There are no unequivocal examples of it from Demotic, the language 
stage that immediately preceded Coptic. Object marking in Demotic is restricted to the 
imperfective tenses, so that the extension of DOM into the non-imperfective tenses must 
be considered as being only a little older than the oldest texts in Coptic. 


2.3 Previous research on DOM with the non-imperfective tenses 


To find out whether the two alternating constructions really are functionally equivalent, 
it is best to undertake a corpus-based statistical investigation. In two previous
papers (Engsheden 2006; 2008), I analysed the canonical gospels in Sahidic Coptic (the
supra-regional dialect of the south). I argued that Coptic can indeed be analysed as an 
example of a language with DOM, and that the selection of the n-marked form was 
determined by both referentiality (or specificity) and topicality (Engsheden 2006: 209- 
212; Engsheden 2008: 329-335), while further possible factors included semantic features 
such as degree of affectedness and causation. No evidence was found for Coptic DOM 
being sensitive to animacy. 

A pertinent example for demonstrating that the marked form corresponds to the topic 
is found in the story of John the Baptist, whose head is what the story is about. Here, 
as elsewhere in this study, I mean by topic an aboutness topic, i.e. “the presupposed
part of which pieces of information are conveyed” (Iemmolo 2010: 262), operating at
the sentence level. I cite here my original Sahidic example since the Gospel of Matthew is
not preserved in Lycopolitan. Immediately before this passage, Salome has asked her 
stepfather the king to give her the head of John the Baptist (14a-14g): 


(14) a. a-f-lupei         nci-p-rro       emate
        PST-3M.SG-grieve  AGT-DEF.M-king  much
     b. etbe     n-anaus=de           mn    n-et-néj                  nmma-f
        because  DEF.PL-oath.PL=PTCL  with  DEF.PL-REL-recline.STATE  with-3M.SG
     c. a-f-ouehsahne      e-ti     mmo-s      na-s      a-f-joou
        PST-3M.SG-command  to-give  ACC-3F.SG  to-3F.SG  PST-3M.SG-send
     d. a-f-fi           n-t-ape         n-ióhannés  hm-pe-steko
        PST-3M.SG-carry  ACC-DEF.F-head  GEN-John    in-DEF.M-prison
     e. a-u-eine       mmo-s      hijm  p-pinaks
        PST-3PL-bring  ACC-3F.SG  on    DEF.M-platter
     f. a-u-taa-s           n-t-Seere      sem
        PST-3PL-give-3F.SG  to-DEF.F-girl  little
     g. a-s-eine         mmo-s      n-tes-maau
        PST-3F.SG-bring  ACC-3F.SG  to-POSS.3F.SG-mother
        ‘The king grieved much. Because of the oaths and those who lay at table with
        him, he commanded to give it (sc. the head) to her, (and) he sent and
        beheaded John in the prison. It was brought on a platter and given to the
        little girl, (and) she brought it to her mother’ (Matthew 14: 9-11)


The head is reactualised in (14c) through an n-marked pronoun. In (14d) it is referred 
to by means of the repetition of the NP, and mentioned next in (14e) with an n-marked 
pronoun before it appears in (14g), once more with an n-marked pronoun. Note that the 
original Greek text here does not have any object pronoun, so there is no influence from 
the original on the use of n-marking. The omission of pronouns for the object in Ancient
Greek correlates with high topicality (Luraghi 2003), which lends support to my analysis.

The identification of topicality as a factor for the marking of the direct object was 
made by observing pronominal anaphora, and how they contribute to the discourse co- 
herence. It is more difficult to demonstrate a similar topical function for full NPs. As with 
extinct languages in general, it is often difficult to investigate discourse-pragmatic fea- 
tures because the competence of native speakers is replaced by a closed corpus of texts. 
It is however not surprising to discover that topicality is a factor for DOM, because it 
has been recognised as such in a wide range of languages (Dalrymple & Nikolaeva 2011: 
125-139; Escandell-Vidal 2009; Iemmolo 2010; Shain & Tonhauser 2010). Accordingly, I
posit that the identification of topicality as a factor in DOM, as suggested for Sahidic
Coptic in my previous articles, is also relevant for Lycopolitan Coptic.⁹

Topicality relates to definiteness in such a way that topics are mostly definite, whereas 
it is less likely that indefinites appear as topics in discourse. It is often taken for granted 
that topics are specific, even though this is not a necessary condition, at least in Romance 
languages (Leonetti 2013: 138-140). The idea that topicality is the trigger for DOM in the
non-imperfective tenses is problematic, because marking varies in frequency depending
on the semantic verb type, as will be illustrated below in §5. Topicality cannot
account fully for the n/Ø variation, since there is no reason why some verbs should never
be followed by a topical object. With those verb types that disprefer n-marking in the
non-imperfective tenses, zero-marked nouns must also be able to function as topics,
as one would not expect lexical restrictions on the verb to depend on the
topical function of the object. A similar uneven distribution of the marker a in Spanish
led Delbecque to state that “if the discourse function were the raison d'être of the
prepositional frame, then, the preposition should be able to appear after any transitive verb”
(Delbecque 2002: 85). Consequently, topicality must work in conjunction with other
factors in order to produce DOM in Coptic.


⁹ Lycopolitan had a closer relationship to Sahidic than to any other Coptic dialect (Funk 1988; Kasser 2002:
343).
Specificity also plays an important role for n-marking in the non-imperfective tenses.
In the example in (15), the definite article is used in a generic sense without reference to 
any specific individuals and, hence, there is no marking on the object: 


(15) tehm-n-héke         mn-n-et-mokh              mn-n-cale        mn   n-blle
     invite-DEF.PL-poor  and-DEF.PL-REL-afflicted  and-DEF.PL-lame  and  DEF.PL-blind
     ‘Invite the poor, the afflicted, the lame and the blind’ (Luke 14: 13)


This example is from Sahidic, but it is not difficult to find examples also in Lycopolitan 
Coptic (see 20). 


3 Data and methodology 


Lycopolitan Coptic (Nagel 1991) was rediscovered at the beginning of the 20th century 
through the discovery of manuscripts from Middle Egypt that date to the 4th and 5th 
centuries AD. Lycopolitan can be divided into the following subdialects (Table 2), for 
which conventional labels are used (cf. Kasser 2006: 418—420). 


Table 2: Lycopolitan subdialects 


L4  Manichaean texts from Medinet Madi (including Homilies; Kephalaia; Psalm-book)¹⁰
L5  Gospel of John (only Chapters 2-20)
L6  Gnostic texts from Nag Hammadi; Acts of Paul
L9  Manichaean texts from the Dakhla oasis


Orthographical/phonological criteria form the basis of these subdivisions, with less at- 
tention being paid to grammatical features. L4 is the most important subdialect by size, 
and makes up almost two-thirds of the entire Lycopolitan text corpus. It is expected to 
grow as there is still unpublished material. The main representative of L5 is largely de- 
rived from a Sahidic Vorlage of the Gospel of John (Askeland 2012: 195-207). L9, known 
from texts discovered as late as in the 1980s, is the only subdialect to include original 
documentary material, whereas all preserved texts from the other subdialects seem to 
be translations from Greek, even though a translation directly from Syriac is sometimes 
invoked for some of the L4 texts. I have deliberately omitted two fragmentary leaves 
of the Pauline epistles, which have been classified as L3 (Kasser 2006: 419). Not only is 
the dialectal identification controversial, but the texts offer too little in matters of object 
marking to warrant their inclusion in this study. It should be noted that the internal 


¹⁰ I have used the older editions (Allberry, Böhlig, Polotsky), but these differ little with regard to objects from
the still-incomplete re-edition in Corpus Fontium Manichaeorum.
relationships of the Lycopolitan varieties and their background are still a matter of dis-
cussion. Some commentators have even questioned whether they should be classified as 
a discrete group among the Coptic dialects (Funk 1985, cf. Kasser 2002). 

To undertake a quantitative analysis of this corpus, I have built a relational database
that includes all instances of the n/Ø variation from published Lycopolitan texts (with
the exception of L3) and contains 7244 entries. The database contains only those
syntactic contexts that would potentially allow DOM, so cases where the n-marking
is part of the valency, such as amahte ‘to seize’ or loans from Greek (see (3)
above), are not included in the counts in Tables 3-6. Heavily restored passages
have been omitted. The fact that the corpus comes from a limited period and is relatively 
large, including several longer texts, makes Lycopolitan appealing for the study of Coptic 
DOM. 

Table 3 illustrates the difference between the number of attestations of n-marked con- 
structions in imperfective (present and imperfect) and non-imperfective tenses. As noted 
above, the n-marked construction is obligatory when the verb is imperfective (cf. §2.1)
with personal pronouns, proper nouns, and grammatically definite nouns.¹² The number
of n-marked objects vs. the total number of occurrences is given in parentheses. 


Table 3: Percentage of marked objects in Lycopolitan in DOM-sensitive contexts
(affirmative sentences only)


                  Personal pronoun  Proper noun  Poss. det. + NP  Def. art. + NP  Idf. art. + NP

Imperfective      100% (640/642)    -            97% (63/65)      96% (100/104)   96% (22/23)
Non-imperfective  5% (206/3793)     54% (21/39)  37% (162/442)    36% (321/889)   32% (89/282)


The low figure for n-marked non-imperfective pronouns is a result of the preference 
for direct affixation of the clitic pronoun to the verb. N-marking is in the majority
only among proper nouns (54%), whereas the zero-marked construction dominates among
NPs introduced by a determiner. The proportion of n-marked constructions decreases slightly
from definite (36%) to indefinite articles (32%), but it is unclear whether any significance
should be attributed to this. It is questionable whether these categories should even be
arranged in a hierarchy.
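
Whether the small drop from definite to indefinite NPs is more than noise can be probed with a simple two-proportion z-test on the non-imperfective counts in Table 3. The following Python sketch is an illustration only and is not part of the original analysis: the helper name `two_proportion_z` is hypothetical, and the test assumes that the tokens are independent observations, which is questionable for corpus data drawn from a small number of texts.

```python
import math

def two_proportion_z(k1, n1, k2, n2):
    """Return (z, two-sided p) for H0: both groups share one marking rate."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

# Non-imperfective counts from Table 3: (n-marked, total)
definite = (321, 889)
indefinite = (89, 282)

z, p = two_proportion_z(*definite, *indefinite)
print(f"definite:   {definite[0] / definite[1]:.1%}")
print(f"indefinite: {indefinite[0] / indefinite[1]:.1%}")
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

Under these assumptions the difference does not reach the conventional 5% threshold, which is in line with the caution expressed above.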

When the data is broken down into Lycopolitan subdialects (Table 4) substantial differ- 
ences become apparent, not only between the subdialects themselves, but also between 
texts and even within texts. For example, only 29% of direct objects in the Manichaean 
Kephalaia (L4) that are preceded by a determiner (indefinite, definite, or possessive) have 
n-marking, whereas 75% are so marked in the Tripartite Tractate (L6). One should note 
that the mean for L9 is negatively influenced by the very low number of n-marked objects 
in non-literary texts.


¹¹ Lycopolitan texts make up only a tiny fraction of all existing Coptic texts; only 2.5% according to one
estimate (Diebner & Kasser 1989: 59).

¹² The few exceptions include cases such as those mentioned with respect to example (9), involving light verbs,
but likely also include simple errors in textual transmission.

Table 4: Frequency of n-marked construction with NP determined by
(in)definite article or possessive determiner in non-imperfective contexts in
Lycopolitan subdialects (affirmative sentences)


                 Poss. det. + NP  Def. art. + NP  Idf. art. + NP  Mean

L4 Homilies      37% (15/40)      31% (20/65)     33% (5/15)      34%
L4 Kephalaia     30% (32/107)     35% (110/311)   22% (25/114)    29%
L4 Psalm-book    41% (68/166)     31% (82/264)    30% (20/66)     34%
L5               29% (14/49)      39% (22/56)     52% (13/25)     40%
L6               65% (26/40)      67% (73/109)    74% (23/31)     69%
L9               18% (7/38)       18% (18/98)     10% (3/29)      15%


The reason for the differences in marking between the various subdialects is currently 
unclear, but see the discussion in §6 for the possibility of a diachronic explanation.
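
The ‘Mean’ column in Table 4 appears to be the unweighted mean of the three rounded percentages in each row, which is not the same as the pooled proportion of n-marked objects. The Python sketch below recomputes both figures for three rows of Table 4. It is an illustration only: the function names are hypothetical, and the inference about how the mean was calculated is an assumption based on the arithmetic, not a statement found in the text.

```python
# (n-marked, total) for Poss. det., Def. art. and Idf. art. + NP, from Table 4
TABLE4 = {
    "L4 Kephalaia":  [(32, 107), (110, 311), (25, 114)],
    "L4 Psalm-book": [(68, 166), (82, 264), (20, 66)],
    "L9":            [(7, 38), (18, 98), (3, 29)],
}

def percent(marked, total):
    """Share of n-marked objects as a rounded percentage."""
    return round(100 * marked / total)

def mean_of_percentages(cells):
    """Unweighted mean of the rounded per-category percentages."""
    return round(sum(percent(k, n) for k, n in cells) / len(cells))

def pooled_percentage(cells):
    """Share of n-marked objects over all tokens in the row."""
    return percent(sum(k for k, _ in cells), sum(n for _, n in cells))

for label, cells in TABLE4.items():
    print(f"{label:14} mean {mean_of_percentages(cells)}%  "
          f"pooled {pooled_percentage(cells)}%")
```

For the Kephalaia, for instance, the pooled figure is 31% rather than the 29% given as the mean, because the large definite-article category weighs more heavily in the pooled count.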


4 Semantic verb categories and DOM 


From the discussion above, it is clear that no single factor determines DOM in Coptic. 
Despite the general importance of definiteness and topicality in the non-imperfective
domain, neither is able to account for the phenomenon on its own, and one is left with a great many
n-marked direct objects for which an interpretation as a topic seems unwarranted. One
way out of this dilemma is to extend the analysis to the immediate environment of the 
object, to inquire whether there was any lexical preference for one construction or the 
other, and whether such preferences had any semantic motivation. One should bear in
mind that if the n-marked construction had spread in a disorderly fashion from the
imperfective tenses into the non-imperfective tenses, there should be no significant
differences in the frequency of n-marked vs. zero-marked constructions between the various
verb types. As will be seen in §5, however, such differences are precisely what is
observed in the corpus. The two constructions of the object are unevenly distributed,
largely in agreement with the degree of affectedness associated with the verb types,
which demonstrates that DOM in Coptic cannot be interpreted as a matter of style, as
mentioned in §2.2. Similarly, in a discussion of object marking in Hindi and Ostyak,
Dalrymple & Nikolaeva (2011: 13) reached the conclusion that the degree of affectedness
does not play a role in DOM in those languages because “[o]ptionality is observed with
exactly the same subjects and exactly the same verbs”. Nor would one expect there to
be lexical restrictions for the use of the marked construction in the optional marking of 
objects. Note that optionality does not mean free variation, and it is doubtful whether 
any free variation involving case-marking vs. zero-form really exists (cf. McGregor 2010: 
1615). Coptic is an example of what has been termed “semantically enabled optionality” 
(Kittilä 2005: 505). 

The degree to which the semantic relationship between the verb and its arguments can 
contribute to the understanding of DOM has been shown in several studies of Spanish.
It is generally held that animacy in conjunction with specificity triggers the use of the
prepositional accusative a before the direct object in standard European Spanish. This 
explains the different object encoding in Spanish sentences where an animate definite 
object is preceded by a (16a), and an inanimate definite object is not (16b). 


(16) a. Vi           a    la     mujer.
        see.PST.1SG  ACC  DEF.F  woman
        ‘I saw the woman.’
     b. Vi           la     mesa.
        see.PST.1SG  DEF.F  table
        ‘I saw the table.’ (von Heusinger & Kaiser 2003: 41)


This traditional approach does not adequately explain the not-infrequent use of a be- 
fore inanimate objects (cf. von Heusinger & Kaiser 2003: 51). One way to explain such ir- 
regularities is to employ a model that takes account of the whole predicate frame, includ- 
ing the relationship between subject and object (Delbecque 2002; García García 2007). 
Thus, in the case of a dynamic verb that is used transitively, one can note a two-sided
approach in which the agentive subject is conceived as reacting to the object, not only
acting upon it (Delbecque 2002: 103). Marking vs. non-marking constructions represent
different event structures. Differences in meaning can be approximated through translation,
as illustrated by abandonar Ø DO ‘to desert, drop, give up’ vs. abandonar a DO ‘to
leave behind, abandon’ (Delbecque 2002: 93).

In their now-classic study, Hopper & Thompson (1980) described transitivity
as a scalar concept consisting of different parameters that can be arranged from
high to low. Thus, telic action characterises a transitive clause more than an atelic ac- 
tion does, a volitional agent is more typical for transitivity than a non-volitional one, 
affirmative sentences are more likely to be transitive than negative sentences, and so 
forth. Another component in the original model was ‘affectedness of O’, which is
characterised as total vs. partial affectedness. The idea of transitivity as a scalar concept
was elaborated in a study by Tsunoda (1985), in which he arranged verbs in seven cat- 
egories, and correlated these with case-frames from many unrelated languages and the 
degree of affectedness. The hierarchy can be reformulated as a scale: effective action > 
perception > pursuit > knowledge > feeling > relationship > ability. Verbs of effective 
action can be further divided into subtypes, depending on whether the verb is resultative
(‘to kill’, ‘to break’, ‘to bend’) or non-resultative (‘to hit’, ‘to shoot’, ‘to kick’, ‘to
eat’). Perception verbs can likewise be divided into two subtypes, one more attained by 
verbal action, and the other less attained: ‘to see’, ‘to hear’, and ‘to find’ are considered 
more attained; ‘to listen’ and ‘to look’ as less attained. The model predicts that any cate- 
gory will be considered for object marking if any higher ranked (to the left in the scale) 
category is marked for transitivity. It has been said that the hierarchy correlates with 
both control and affectedness (Testelec 1998). These parameters were further studied by 
Malchukov (2005), who deconstructed Tsunoda’s original hierarchy into two dimensions.
The first (sub-)hierarchy shows decreasing patienthood (break > hit > look for > search >
go) and the second (sub-)hierarchy decreasing agenthood (break > see/know > like/fear
> freeze/be cold). Such divisions of verb types following semantic principles are of inter- 
est for the present paper because they provide points of comparison for testing, to see 
whether the statistical arrangement in Table 5 can be matched to semantic features. 
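
One way to operationalise such a comparison is to treat the hierarchy as an ordered list and to check whether the verb types that take the marked frame form an unbroken stretch at the top of the scale. The Python sketch below encodes Tsunoda's scale as given above. The function name and the two sample patterns are hypothetical; they serve only to illustrate the check and do not report data from any language.

```python
TSUNODA_SCALE = [
    "effective action", "perception", "pursuit", "knowledge",
    "feeling", "relationship", "ability",
]

def follows_hierarchy(marked_types):
    """True if the marked verb types form an unbroken top segment of the scale."""
    unknown = set(marked_types) - set(TSUNODA_SCALE)
    if unknown:
        raise ValueError(f"not on the scale: {sorted(unknown)}")
    flags = [verb_type in marked_types for verb_type in TSUNODA_SCALE]
    # After the first unmarked type, no lower-ranked type may be marked.
    first_gap = flags.index(False) if False in flags else len(flags)
    return not any(flags[first_gap:])

# Hypothetical marking patterns, for illustration only
conforming = {"effective action", "perception", "pursuit"}
skipping = {"effective action", "feeling"}

print(follows_hierarchy(conforming))
print(follows_hierarchy(skipping))
```

A pattern like the second one, in which a lower-ranked type is marked while a higher-ranked one is not, is the kind of mismatch that a frequency table arranged by verb type can reveal.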

It is probable that one can correlate DOM with the verb-type hierarchy. Some lan- 
guages for which affectedness has been claimed as an important factor for DOM are: 
Abui (Kratochvíl 2014), Ancient Greek (Riaño 2014), Djapu (Næss 2007: 205), Mongolian
(Guntsetseg 2008: 64-65), and Spanish (von Heusinger & Kaiser 2003; 2011). The differ- 
ence between the partitive and the accusative in Finnish has also been explained in terms 
of partly affected vs. highly affected object (Hopper & Thompson 1980: 262; Næss 2004:
1203; critically Iemmolo 2013: 381). A practical application of verb-type hierarchies in
relation to the argument realisation strategies in DOM can be found in a study by von 
Heusinger & Kaiser (2007), in which the authors were able to show how the frequency of 
the prepositional accusative increased over time, from Old Spanish up to modern Span- 
ish, based on an analysis of successive translations of the Bible. In that article, only three 
verbal prototypes were chosen for analysis: (a) to hurt/kill, (b) to see/find, (c) to put/take. 
The authors found it plausible that the lexical semantics of the verb were a driving force 
in the diachronic development of Spanish DOM, and they carried the analysis a step fur- 
ther in a subsequent study of twelve verbs, which represented the first five verb types 
from Tsunoda's verb-type hierarchy (von Heusinger & Kaiser 2011). They discovered that 
the Spanish data did not entirely agree with the hierarchy, inasmuch as verbs of feeling 
(querer ‘to love’, temer ‘to fear’), contrary to expectation, take a more transitive
case-frame than verbs of perception, such as ver ‘to see’ or mirar ‘to look at’ (von Heusinger
& Kaiser 2011: 612). The competition for agentivity between the participants in the event 
was mentioned as a possible cause for this (von Heusinger & Kaiser 2011: 613). 

In Coptic, the object of perception verbs is typically introduced by a preposition, 
mostly a (Sahidic e), which also has a directional meaning ‘to’. This explains why verbs
of perception are poorly represented in the material analysed in §5.3. However, verbs of
feeling are lower ranked than verbs of perception, and take a zero-marked object. This 
disagrees with Malchukov's two-dimensional model (2005: 81), which predicts that any 
intermediate verb-type category will display the same case-frame if both higher- and 
lower-ranked categories do so. Among the verbs of perception are neu ‘to see’ (e.g. 17)
and sótme ‘to hear’.¹³ There is no TAM-based split for perception verbs, as can be seen in
a comparison between (17a), which has a verb in the imperfective tense, and (17b), which 
has a non-imperfective verb. 


(17) a. tn-neu   ara-k     tinou  p-makarios
        1PL-see  at-2M.SG  now    DEF.M-blessed
        ‘We see you now, o blessed one’ (Psalm-book 26, 12)


¹³ A thorough study of the valency of this verb is found in Emmel (2006). The occasional alternation between
e and the usual construction with n/mma is different from the n-marked vs. zero-marked construction, and
is not pertinent to the present study.

     b. a-u-neu      a-u-ouaine     n-brre
        PST-3PL-see  at-INDF-light  GEN-new
        ‘They saw a new light’ (Psalm-book 196, 18)


The preposition a also occurs before the object with some speech verbs, such as smou
‘to bless’ or moute ‘to call’. This correlation of argument realisation and verb type is so
strong that a is also used with loan verbs from Greek, such as the mental verb pisteue ‘to
believe’:
(18) ari-pisteue      a-p-ou[aein]
     do.IMP-believe   to-DEF.M-light
     ‘Believe in the light’ (John 12: 36)


The government of perception verbs has historical roots in earlier phases of the Egyptian
language. Indeed, the Coptic verbs introducing their object with the preposition
a originally had diverse object marking strategies. It might be that the semantic
development of late hieroglyphic nw ‘to look at’ (Depuydt 1988: 6-7), which later became a
neutral verb of vision in Lycopolitan neu (Sahidic nau) ‘to see’, had operated on other
perception verbs while retaining its government r (older Egyptian) > a (Coptic).


5 Analysis 


In this section, I will review the ways that verb types relate to DOM in Lycopolitan Coptic.
The data, which is drawn from the texts discussed in §3, is presented as a simple
frequency list of verbs with n-marked and determined NPs, for which see Table 5. Very
few proper nouns are attested as objects with non-imperfective forms, and conclusive
results can rarely be obtained from them; proper nouns have therefore been omitted from
the analysis and the discussion. For convenience, I use the verb-type hierarchy proposed
by Tsunoda (§4). Tsunoda's division of verbs of effective action into two sub-categories,
of resultative and non-resultative action, has also been expanded to include other
categories, although for the purposes of this paper I refer mainly to change-of-state verbs. I
also use Tsunoda's classification of verbs as a heuristic tool without any attempt to refine
the verb-type hierarchy itself. Of course, it is an oversimplification to make verbs fit
into a single category without paying attention to how the presence of other arguments
in the sentence can lead to recategorisation.

All the transitive verbs listed in Table 5 are attested at least ten times in affirmative 
sentences of the Lycopolitan Coptic corpus. I define a verb as transitive (bi- or trivalent) 
if its object can be coded with the imperfective tenses in at least some contexts through 
n-marking or zero-marking. The table therefore only lists those verbs that participate in 
the n/Ø variation. As noted above, verbs of perception code their objects through the
preposition a, and are therefore omitted. The object NP is always preceded by one of the 
determiners (definite article, indefinite article, or possessive determiner). The lemmas 
are listed in the first column in their Lycopolitan form (which differs only slightly from
Sahidic). The second column shows the number of the morphological verb class according
to a modern standard grammar (Layton 2000: 153-157). Where no number is provided,
the verb should be considered irregular. The third column presents a standard
translation. The fourth column contains the percentage of n-marked constructions
out of the total number of occurrences, with the raw counts (n-marked/total) shown in
parentheses. The fifth column provides the subdialect from which the attestations come; the
dominance of L4 (Manichaean texts) is evident. The final column lists the subsection in
this paper where examples of the verb in question may be found.


14 In Greek, the object takes the dative, so the construction cannot be explained as a calque of the source language.

Table 5 shows that n-marking with determined NPs in the non-imperfective
tenses ranges from 0% to 92%, with a median of 36%. Even a quick perusal reveals


Table 5: Distribution of n-marking with non-imperfective tenses and determined NPs
for the most common transitive verbs in Lycopolitan Coptic (affirmative sentences only)


Verb    Class  Translation              Percentage (ratio)  Subdialect       Section
tóhme   1      to call                  92% (11/12)         L4               5.1
hótbe   1      to kill                  91% (10/11)         L4               5.1
t'bio   5      to humiliate (subdue)    75% (21/28)         L4, L9           5.1
kót     2      to build                 70% (7/10)          L4               5.1
sótp    1      to choose                67% (6/9)           L4               5.2
jók     2      to complete, finish      64% (16/25)         L4, L5, L6, L9   5.1
pórs    1      to spread out            60% (6/10)          L4               5.2
teko    5      to destroy               54% (7/13)          L4, L6           5.1
jpo     5      to beget                 53% (8/15)          L4, L5, L6       5.1
smine   7      to establish             48% (12/25)         L4, L6, L9       5.1
tóbh    1      to implore, pray         44% (8/18)          L4, L5, L6       5.2
ji      -      to take                  41% (74/179)        L4, L5, L6, L9   5.2
teho    5      to reach, set up         36% (8/22)          L4, L6           5.2
nouje   2      to throw                 36% (4/11)          L4, L5, L6       5.2
tnnau   5      to send                  28% (6/21)          L4, L5, L6, L9   5.2
saune   -      to know                  27% (5/21)          L4, L6, L9       5.3
mour    2      to bind                  22% (4/18)          L4, L6           5.2
eire    -      to do                    22% (37/168)        L4, L5, L6, L9   5.2
ti      -      to give                  21% (30/141)        L4, L5, L6, L9   5.2
cine    7      to find                  21% (19/91)         L4, L5, L6, L9   5.3
shei    -      to write                 20% (3/18)          L4, L9           5.1
teouo   5      to send, produce, utter  20% (11/58)         L4, L5, L6, L9   5.2
kó      2      to put, leave            17% (12/71)         L4, L5, L6, L9   5.2
fi      -      to bear, carry           14% (10/69)         L4, L6           5.2
eine    7      to bring                 3% (1/37)           L4, L5, L6, L9   5.2
šine    7      to seek, ask             0% (0/24)           L4               5.4
meie    -      to love                  0% (0/20)           L4, L5, L6       5.5


that the distribution of verbs shows agreement with semantically-defined verb types,
ordered according to the affectedness hierarchy, especially at the upper and lower ends. 
There are considerable differences between the individual verbs, with some (e.g. hótbe ‘to
kill’) predominantly having the n-marked construction, while others (e.g. meie ‘to love’)
exclusively take the zero-marked construction. In the imperfective tenses, all the listed
verbs must take the n-marked construction with determined NPs (see §2.1). Morphology
does not trigger the selection of the object construction. Most importantly, this contradicts
the idea, first expressed by Steindorff (1894: 165), that the zero-marked construction is 
typical for the category of the fifth class, which contains etymological causatives. The 
Lycopolitan data show that verbs belonging to this class are subject to the effect of lexical 
semantics to the same degree as other verb types. 
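
The summary figures quoted for Table 5 can be checked against the published percentages. The following Python sketch is offered only as an illustration and is not part of the original analysis: it assumes the rounded percentages exactly as printed in Table 5, and the names TABLE5_PERCENT, lowest, highest and overall_median are invented here.

```python
from statistics import median

# Rounded n-marking percentages as printed in Table 5 (non-imperfective
# tenses, determined NPs). The dictionary name is hypothetical.
TABLE5_PERCENT = {
    "tóhme": 92, "hótbe": 91, "t'bio": 75, "kót": 70, "sótp": 67,
    "jók": 64, "pórs": 60, "teko": 54, "jpo": 53, "smine": 48,
    "tóbh": 44, "ji": 41, "teho": 36, "nouje": 36, "tnnau": 28,
    "saune": 27, "mour": 22, "eire": 22, "ti": 21, "cine": 21,
    "shei": 20, "teouo": 20, "kó": 17, "fi": 14, "eine": 3,
    "šine": 0, "meie": 0,
}

values = list(TABLE5_PERCENT.values())
lowest, highest = min(values), max(values)   # range of n-marking
overall_median = median(values)              # middle value of the 27 verbs

print(f"{len(values)} verbs, range {lowest}%-{highest}%, median {overall_median}%")
```

Because the input is the rounded percentage column, the sketch can only confirm the range and the median; it says nothing about the underlying token counts.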


5.1 Verbs of effective action (resultative) 


There is a strong correlation between n-marking and verbs of effective action. The subject
is highly agentive and volitional, exercising full control over the action. The object is fully
affected and undergoes a change of state. Among them one finds hótbe ‘to kill’, jók ‘to
complete’, and teko ‘to destroy’, but verbs of creation are included in this group as well.
The median percentage of n-marking for verbs of effective action is 64%, which means that
resultative verbs of effective action are predominantly n-marked. A representative example
of n-marking with a verb of effective action is (19):


(19) a-u-hótbe      n-n-sabeue
     PST-3PL-kill   ACC-DEF.PL-wise.PL
     ‘They killed the wise men’ (Homilies 80, 30)


It seems significant that the only example of a zero-marked object with hótbe ‘to kill’, 
which is quoted in (20), has a generic referent. As noted at the end of §2, non-specific
objects take the zero-marked construction. 


(20) n-t-he             n-hn-róme         e-u-na-hatbe-hn-moui
     ADV-DEF.F-manner   GEN-INDF.PL-man   CIRC-3PL-FUT-kill-INDF.PL-lion
     ‘in the manner of men who are about to kill lions’ (Psalm-book 205, 30)


At first sight, the verb tóhme ‘to call’, which scores highest in selecting the n-marking
in Table 5, does not seem to be an ideal candidate for demonstrating the relevance of 
affectedness; under normal circumstances the object of ‘to call’ is not affected by the verb 
action. However, Manichaean cosmogony provides a likely explanation for this deviance 
from the expected pattern. The call directed at the various Aeons is a metaphor for them 
being called into existence as a counter-measure against the approaching advent of Evil. 
This creational aspect can be highlighted by the translation ‘call forth’ (cf. Kasser 1991). 


(21) pa-iót            p-ouaine      et-talél...    a-f-tóhme        n-n-aión
     POSS.1SG-father   DEF.M-light   REL-be.glad    PST-3M.SG-call   ACC-DEF.PL-aeon
     m-p-ouaine...     a-f-tóhme        n-n-aión          n-t-eiréné...
     GEN-DEF.M-light   PST-3M.SG-call   ACC-DEF.PL-aeon   GEN-DEF.F-peace
     a-f-tóhme         n-n-aión          m-p-scraht...    se-[h]atp
     PST-3M.SG-call    ACC-DEF.PL-aeon   GEN-DEF.M-rest   3PL-be.in.peace.STATE
     tér-ou    se-ti-mete
     all-3PL   3PL-give-satisfaction
     ‘My father, the glad Light... He called forth the Aeons of the Light into
     existence... He called forth the Aeons of the Peace... He called forth the Aeons of
     the Rest... They are all in peace and satisfied’ (Psalm-book 203, 3-23)


Note that the sequence in (21) contains inanimate objects, whereas animates are nor- 
mally expected with this verb, but it is unclear whether animacy has any significance 
for the selection of the object construction. This cannot easily be resolved because the 
boundaries between animate and inanimate were not sharp in the Manichaean universe. 
One might also wish to consider the topical status of the objects, which was announced
in the title of the psalm: ‘Concerning the Father and all his Aeons and the Stirring of
the Enemy’. The Aeons appear again in the discourse as a plural subject in the last line
quoted above. 

Related to this are examples of n-marking with the verb t'bio, the usual translation of
which is ‘to humiliate’, but which is better translated in examples such as (22) as ‘to subdue’
(cf. 1a) when followed by an n-marked NP, to signal a higher degree of affectedness.


(22) a-u-t'bio        m-p-keke
     PST-3PL-subdue   ACC-DEF.M-darkness
     ‘They have subdued the darkness’ (Kephalaia 35, 5)


This does not prevent zero-marked constructions from appearing with similar mean- 
ings (23): one cannot readily identify the two Coptic predicate frames with different 
verbs in translation, as one might by using Delbecque's model for Spanish (see §4).


(23) n-t-he          je     a-p-sarp          n-róme    t'bio-p-keke
     ADV-DEF.F-way   PTCL   PST-DEF.M-first   GEN-man   subdue-DEF.M-darkness
     ‘in the way the first man subdued the darkness’ (Kephalaia 49, 4)


In some cases, such as in (24), not even the combination of a definite reference and an 
affected object produces n-marking. The object hób (lit. ‘thing’), which does not recur in 
the discourse, can be regarded as synonymous with the head of an indefinite relative clause
‘that which [lit. the thing] you have given to me to do’. What the work consisted of is
not explained. Whether discourse factors play a role is unclear. The object is in this case 
non-topical. 


(24) a-ei-jak-p-hób               abal   nt-a-k-tee-f              néi
     PST-1SG-finish-DEF.M-thing   PTCL   REL-PST-2MSG-give-3M.SG   to.1SG
     a-tr-a-ee-f
     to-CAUS-1SG-do-3M.SG
     ‘I have finished the thing you have given me to do’ (John 17: 4)


Even though the general trend is clear, there are significant differences between the
verbs in this group that require explanation. It is difficult to see any reason why hótbe
‘to kill’ and teko ‘to destroy’ would take 91% and 54% of n-marked objects respectively.


5.2 Verbs of effective action (non-resultative) 


Many of the verbs listed in Table 5 are action verbs where the actor retains control of the 
action expressed by the verb. The object is little affected, but may undergo limited physi- 
cal movement (e.g. ‘to spread out’, ‘to take’, ‘to throw’, ‘to set up’, ‘to bring’). The median 
percentage of n-marking in this group is 25%, so n-marking is clearly the exception. This 
group could be further divided into subcategories on semantic grounds, but this would 
obscure the relevant point, which is the overall dependency of object marking on the 
affectedness hierarchy. 
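
The two group medians (64% for resultative and 25% for non-resultative verbs of effective action) can be recomputed from Table 5. The sketch below is a hedged illustration rather than the procedure actually used for the paper: it assumes that group membership is given by the Section column of Table 5 (5.1 and 5.2) and that the rounded percentages are the input; the names EFFECTIVE_ACTION and group_median are hypothetical.

```python
from statistics import median

# (verb, rounded percentage, section) as printed in Table 5; only the two
# groups of effective-action verbs are listed. All names are hypothetical.
EFFECTIVE_ACTION = [
    ("tóhme", 92, "5.1"), ("hótbe", 91, "5.1"), ("t'bio", 75, "5.1"),
    ("kót", 70, "5.1"), ("jók", 64, "5.1"), ("teko", 54, "5.1"),
    ("jpo", 53, "5.1"), ("smine", 48, "5.1"), ("shei", 20, "5.1"),
    ("sótp", 67, "5.2"), ("pórs", 60, "5.2"), ("tóbh", 44, "5.2"),
    ("ji", 41, "5.2"), ("teho", 36, "5.2"), ("nouje", 36, "5.2"),
    ("tnnau", 28, "5.2"), ("mour", 22, "5.2"), ("eire", 22, "5.2"),
    ("ti", 21, "5.2"), ("teouo", 20, "5.2"), ("kó", 17, "5.2"),
    ("fi", 14, "5.2"), ("eine", 3, "5.2"),
]


def group_median(section):
    """Median percentage of n-marking for the verbs assigned to `section`."""
    return median(pct for _, pct, sec in EFFECTIVE_ACTION if sec == section)


resultative = group_median("5.1")       # 9 verbs: the middle value
non_resultative = group_median("5.2")   # 14 verbs: mean of the two middle values

print(f"resultative: {resultative}%, non-resultative: {non_resultative}%")
```

With an even number of verbs in the second group, the median is the mean of the two central percentages (22 and 28), which is why it does not coincide with any single verb in the table.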

It can be difficult to identify where the difference between the n-marked construction
and the zero-marked construction lies. These difficulties are illustrated in (25a-25b).


(25) a. a-f-nouje         n-hn-jór[me]           ha           te-f-staurósis
        PST-3M.SG-throw   ACC-INDF.PL-allusion   concerning   POSS.3M.SG-crucifixion
        ‘He made allusions to his crucifixion’ (Homilies 44, 17)
     b. a-u-nouj-ou-halu[sis]      a-pe-f-mout
        PST-3PL-throw-INDF-chain   to-POSS.3M.SG-neck
        ‘They put a chain around his neck’ (Homilies 48, 21)


Both examples are from the same text, which tells how Mani, the founder of the re- 
ligion named after him, suffered martyrdom in AD 277. In the first example (25a), the 
n-marked object (‘allusions’) is inanimate and indefinite, and does not seem to be more 
affected than the object in (25b). Pragmatic factors may be relevant, because the 'allu- 
sions' in (25a) may be a reference to Mani's comments on his martyrdom in the following 
line. The marking would then indicate that the indefinite noun 'allusions' should be re- 
garded as specific, serving as a referential anchor (cf. von Heusinger 2002). In (25b), on 
the other hand, the ‘chain’ does not appear again in the following discourse. To understand
the extent to which discourse-informational factors matter for n/Ø variation, one
would need to explore discourse persistence in a systematic fashion, which would require
time-consuming manual processing. Due to the non-narrative character of most
texts in the corpus, there is little referential persistence with regard to the direct object, 
so that referential tracking becomes difficult. 

Within this group are a few verbs for which the subject exercises full control over the 
action, and that have a non-affected object, such as sótp ‘to choose’, tóbh ‘to implore’, 
and teouo ‘to utter’. In this context, the verb tóbh ‘to implore’ has different case-frames 
depending on the animacy of the object. On the one hand, when the object is inanimate, 
such as in (26), n-marking is clearly preferred. On the other hand, zero-marking is used 
with animate objects, as in (27), in a way that is reminiscent of other speech verbs (cf. 
teouo ‘to utter’ and šine ‘to ask’).


(26) e-u-a-tóbh        m-p-kó           abal   n-n[ou-nabe]       n-tot-f
     FUT-3PL-FUT-ask   ACC-DEF.M-give   away   GEN-POSS.3PL-sin   from-hand-3M.SG
     m-p-noute
     GEN-DEF.M-god
     ‘They will ask for the forgiveness of their sins from God’ (Homilies 23, 8)


(27) a-k-tabh-pek-iót
     PST-2MSG-ask-POSS.2MSG-father
     ‘You have asked your father’ (Psalm-book 44, 11)


It is also worth mentioning the verb eire ‘to do’. It frequently occurs as a light verb in a
few common expressions, such as r-p-meue ‘to remember’ (lit. ‘to do the remembrance’),
where the incorporation of the object underlines its low referential status (see 9). The 
compound is understood synchronically as a verb, and can even be followed by an n- 
marked object, as in (28). 


(28) e-s-na-r-p-meue                 mma-ou    nci-ti-ekklésia
     FOC-3F.SG-FUT-do-DEF.M-memory   ACC-3PL   AGT-DEF.F-church
     ‘the Church will remember them’ (Tripartite Tractate 135, 25)


The presence of an object changes the telicity of the verb, which is another factor for 
high transitivity according to Hopper & Thompson (1980). One may contrast examples 
with a nuance ‘to do s.o.'s wish’, i.e. to fulfill it (29), against other examples where the
activity is unbounded, as is the case of 'spending (lit. doing) time' (30). 


(29) hn   ou-špnšópe    mn-ou-[c]lam     ša-u-eire    m-p-ók
     in   INDF-sudden   and-INDF-rapid   AOR-3PL-do   ACC-DEF.M-delight
     n-hét       m-pou-jais
     GEN-heart   GEN-POSS.3PL-lord
     ‘Suddenly and rapidly they fulfill the desire of their lord’ (Kephalaia 51, 16-17)


(30) alla   a-u-r-pou-kairos           tér-f       e-u-sópe          hn-<ou-thli>psis
     but    PST-3PL-do-POSS.3PL-time   all-3M.SG   CIRC-3PL-become   in-INDF-distress
     ‘But they spent all their time falling in distress’ (Kephalaia 150, 29)


Although the general trend between resultative and non-resultative action seems clear, 
there are considerable differences between the Lycopolitan subdialects regarding the 
frequency of the n-marked construction, as seen for a selection of verbs in Table 6. 

This reveals very different proportions of n-marking in the various subdialects of Ly- 
copolitan Coptic. It is notable that n-marking is virtually non-existent in L9, especially in 
non-literary texts. It is difficult to tell what this signifies. There are also remarkably low 
n-marking percentages for several verbs in L4. In this subdialect, zero-marking appears 
to constitute the normal transitive construction for verbs of non-resultative action. It 
seems as if n-marking is more common in L5, but the totals are rather low for that
subdialect. By contrast, the percentage of n-marking is high in L6, with no real distinction


Table 6: Subdialectal variation in n-marking for a selection of verbs


Verb           L4             L5            L6            L9
ji ‘take’      30% (29/98)    72% (13/18)   88% (29/33)   10% (3/29)
eire ‘do’      23% (26/115)   50% (4/8)     25% (6/24)    5% (1/18)
ti ‘give’      10% (9/92)     50% (2/4)     82% (18/22)   5% (1/21)
kó ‘put’       6% (3/53)      44% (4/9)     83% (5/6)     0% (0/3)
fi ‘carry’     7% (3/46)      10% (1/10)    86% (6/7)     0% (0/5)


in treatment between resultative (as in §5.1) and non-resultative action verbs. Thus,
n-marking is clearly the norm in the L6 dataset, the only clear deviation from the trend
being eire ‘to do’, partly due to its frequent use as a light verb. If one omits objects as
complements in complex predicates, which are zero-marked (as in 28), one still does not
arrive at more than 46% n-marking with eire.
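
To make the subdialectal contrast in Table 6 easier to see, the counts can be pooled per subdialect. The following Python sketch is an illustration only: pooling over the five selected verbs is an assumption made here, not a step reported in the paper, and the names TABLE6, pooled_counts and pooled are hypothetical.

```python
# (n-marked, total) counts per verb and subdialect as printed in Table 6.
TABLE6 = {
    "ji":   {"L4": (29, 98),  "L5": (13, 18), "L6": (29, 33), "L9": (3, 29)},
    "eire": {"L4": (26, 115), "L5": (4, 8),   "L6": (6, 24),  "L9": (1, 18)},
    "ti":   {"L4": (9, 92),   "L5": (2, 4),   "L6": (18, 22), "L9": (1, 21)},
    "kó":   {"L4": (3, 53),   "L5": (4, 9),   "L6": (5, 6),   "L9": (0, 3)},
    "fi":   {"L4": (3, 46),   "L5": (1, 10),  "L6": (6, 7),   "L9": (0, 5)},
}


def pooled_counts(subdialect):
    """Sum n-marked and total occurrences over all five verbs for one subdialect."""
    marked = sum(cells[subdialect][0] for cells in TABLE6.values())
    total = sum(cells[subdialect][1] for cells in TABLE6.values())
    return marked, total


pooled = {sub: pooled_counts(sub) for sub in ("L4", "L5", "L6", "L9")}

for sub, (marked, total) in pooled.items():
    print(f"{sub}: {marked}/{total} = {100 * marked / total:.0f}% n-marked")
```

The pooled figures weight each verb by its frequency, so frequent verbs such as ji and eire dominate; they are meant as a rough orientation, not as a replacement for the per-verb percentages in the table.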


5.3 Verbs of perception/cognition 


As stated above (end of §4), the object of perception verbs is mostly introduced by the
preposition a. This explains why Table 5 only lists two examples of perception verbs
participating in the n/Ø variation (saune ‘to know’, cine ‘to find’). The agent exerts no
control over the action and the object is unaffected.

The behaviour of saune, which has 27% n-marking, is unique to Lycopolitan: I do not 
know any examples of an n-marked object together with this verb in any dialect other
than Lycopolitan. In the imperfective (31) the stem is saune (Sahidic sooun): 


(31) [e]peidé   f-saune      n-t-gnósis
     since      3M.SG-know   ACC-DEF.F-gnosis
     ‘Since he knows the gnosis’ (Kephalaia 233, 26)


The verb saune is evidently a secondary form, having developed out of a verb form 
often called the stative, which expresses a resultative state (Peust 2013: 163). The morphol- 
ogy of this verb is quite complex and presents many variants (overview in Vycichl 1983: 
202). The verb saune itself, like similar verbs expressing knowledge in earlier Egyptian 
dialects, was originally an inchoative mental verb, not a verb of state, that had the basic 
meaning ‘get to know’. It is only through the spread of the stative form that the verb
evolved into a verb of state, similar to one meaning of the English ‘to know’. In dialects
other than Lycopolitan, saune (and predictable variants thereof) is used indiscriminately 
with imperfective and non-imperfective tenses. With non-imperfective TAM forms, NPs 
as direct objects are almost invariably zero-marked (and thus different from Lycopolitan). 
Originally, the stem may have been souón/snouón, and it appears as such in Lycopolitan 
with non-imperfective tenses, with either n-marking (32) or zero-marking (33).¹⁵ In other
dialects, this allomorph is used with zero-marking. 


15 Saune is also possible with a non-imperfective, when there is no object.


(32) a-i-snouón     n-ta-psuché
     PST-1SG-know   ACC-POSS.1SG-soul
     ‘I have known my soul’ (Psalm-book 56, 26)


(33) tote   e-u-šan-souón-p-iót
     then   COND-3PL-COND-know-DEF.M-father
     ‘Then if they know the father’ (Gospel of Truth 24, 31)


A comparatively high percentage (21%) of n-marked objects is found with cine ‘to
find’. In Tsunoda's original model, the verb ‘to find’ was listed among perception verbs
based on the argument realisation of ‘to find’ in, for example, North Caucasian languages
(cf. Ganenkov 2006), though ‘to find’ can also be a verb of perception in English (Simon-
Vandenbergen 1999: 423). It is not easy to see which semantic reason could favour
one object marking strategy over the other for this verb. Compare the following, where
the objects are near synonyms expanded through a genitive adjunct, and the verb is in
the past tense both times.


(34) a. [...]  a-f-cine         n-t-sbió             m-pf-his[e]
               PST-3M.SG-find   ACC-DEF.F-requital   GEN-POSS.3M.SG-toil
        ‘... he has found the requital for his toil’ (Homilies 83, 19)
     b. je    a-i-cn-p-beke               m-pa-hise
        for   PST-1SG-find-DEF.M-reward   GEN-POSS.1SG-toil
        ‘for I have found the reward of my toil’ (Psalm-book 93, 30)


The fragmentary context of (34a) makes it impossible to observe anaphoric behaviour. 
The selection seems to be truly optional. 

Four instances where the object of cine is n-marked can be interpreted as being topical. 
This interpretation follows from the repetition of the object each time in short, explana- 
tory nominal sentences. 


(35) a-i-cine         n-t-mró             t-mró           te              t-entolé...
     PST-1SG-find     ACC-DEF.F-harbour   DEF.F-harbour   COP             DEF.F-command
     a-i-cine         n-n-ejéu            n-ejéu          ne              p-ré
     PST-1SG-find     ACC-DEF.PL-ship     DEF.PL-ship     COP.PL          DEF.M-sun
     mn-p-ooh         a-i-cine            n-ou-héu        e-mn-ase        [nhét-f]
     and-DEF.M-moon   PST-1SG-find        ACC-INDF-gain   CIRC-NEG-loss   in-3M.SG
     ‘I found the harbour. The harbour is the Commandment... I found the ships. The
     ships are the sun and the moon... I found a gain wherein there is no loss...’
     (Psalm-book 168, 1-9)


There is a further consideration, because the second consonant in cine is identical to
the object marker n, and this could play a role in the common use of zero-marking. It is
true that phonology can override semantic-pragmatic parameters, as sometimes happens
with the Spanish a (Kliffer 1995: 108), in order to promote the zero-marked
form. But the percentage of attestations for the n-marked construction differs between
cine (21%) and šine (0%), which has the same rhyming pattern, so phonological
influence is unlikely.


5.4 Verbs of pursuit 


In this category the subject has a low degree of control and the object is unaffected. The 
list comprises a single verb of pursuit, šine ‘to ask’, which here is zero-marked (36). In
other dialects (Akhmimic, Mesokemic), where the percentage of n-marking is higher, the 
object of this verb can be n-marked. 


(36) a-ke-mathétés        šn-p-apostolos
     PST-other-disciple   ask-DEF.M-apostle
     ‘Another disciple asked the apostle’ (Kephalaia 208, 15)


5.5 Verbs of feeling 


Here the subject lacks control, the object is not affected, and the verb expresses a state. 
The verb meie (Sahidic me) ‘to love’ is incompatible with n-marking in the non-imperfective
tenses, a feature that appears to be shared by all Coptic dialects.¹⁶ See (4) and (8) for
examples with the imperfective. Its antonym maste 'to hate', not included in the list 
above, also avoids n-marking in the non-imperfective. 


(37) a-u-mrre-p-eau gar n-n-róme 
PST-3PL-love-DEF.M-glory for GEN-DEF.PL-man 
‘for they loved the glory of men’ (John 12: 43) 


In this context, it is appropriate to consider ouós ‘to want’, ‘to wish’. As mentioned at
the end of §2.1, this verb is the sole exception to the rule that definite objects must be 
n-marked with the imperfective tenses. A problem for the historical explanation referred 
to earlier is that the difference between wh3 n O ‘to look for’ and wh3 O ‘to wish’ is found 
only in Demotic (Depuydt 1993), meaning that it had disappeared before the spread of 
n-marking into the non-imperfective. Once the former expression had disappeared, it 
would have been possible for ouós to have taken part in the expansion of object marking. 
A semantic analysis based on affectedness offers an alternative, functional explanation, 
which holds true synchronically. Thus, semantics may have blocked ouós from acquiring 
object marking in the non-imperfective, and it may have had a similar effect on the 
imperfective. 


16 I know of only one possible example of this verb with a marked direct object: p-e-ša-u-ka-ou-koui de na-f
ebol e-ša-f-me n-ou-koui ‘The one to whom little is forgiven, he loves only a little’ (Sahidic Luke 7: 47), in
which the object is focalised by means of the preposition. It therefore does not seem to be an example of a
differential context.


6 Discussion


The foregoing section lends support to the idea that Coptic DOM can be successfully 
analysed, based upon a view of transitivity as a scalar concept involving several semantic 
features (Hopper & Thompson 1980). In Coptic, definiteness, specificity, topicality, and 
affectedness seem to act together to create a high degree of transitivity, and interact in 
triggering n-marking. How the various factors contributing to DOM in Coptic relate to 
each other is open to question. The study of the development of DOM in Coptic is still 
in its formative stages, and the following remarks are therefore preliminary, and have 
no immediate bearing on Coptic dialects other than Lycopolitan. 

Definiteness is a factor for object marking with all TAM forms, although in the
non-imperfective tenses it leads only to optional DOM (cf. §4). I posit that marking spread
across definite NPs more-or-less simultaneously, and not stepwise from one definite 
category to the next, because the difference in percentages of n-marked nouns seems 
negligible when compared to determined NPs (see Table 3). This last fact speaks against 
a spread along the definiteness hierarchy scale as claimed, inter alia, for the Spanish 
prepositional accusative (Aissen 2003). The topical status of the marked objects may 
have been a secondary development, which followed from semantic definiteness. A top- 
ical function is best visible in the phonologically heavier form mma, which was used for 
pronouns (see 14) that are semantically definite. The n-marked object would receive sep- 
arate stress from the verb, and thus in an iconic way reflect the saliency of the object. If 
so, n-marking might be described as a topicalisation strategy through right-dislocation, 
even though the right periphery is not recognised as a position for topics in Coptic. It is, 
however, difficult to identify topicality in NPs as objects by studying referential coher- 
ence, because the non-narrative character of most Lycopolitan texts is such that objects, 
once mentioned, do not commonly persist over several sentences, and their behaviour 
cannot be observed. Substitution or question tests for topicality are difficult to apply 
without a native speaker's intuition. It can be expected that the effect of topicality for 
overruling the expected selection of n vs. Ø would be greatest for non-effective action
verbs (see §5.2), because this is the only group in which one notes significant differences
between the subdialects (see Table 6). These differences, ultimately affecting the percent- 
age and their placement in the list in Table 5, indicate that not all factors operated in an 
identical manner in all subdialects. 

The frequency list of Lycopolitan transitive verbs and their construction with non- 
imperfective tenses, in Table 5, shows that object marking was generally in agreement 
with Tsunoda's affectedness hierarchy, particularly at the upper and lower ends. Over 
90% of examples of a typical action verb with an affected object (§5.1), such as ‘to kill’,
take n-marked objects, while a typical verb of feeling (§5.5), ‘to love’, takes 0%. The more
the object is affected, the more likely it is to receive n-marking. It is more difficult to
assess the large group of non-effective action verbs (§5.2).

The correlation between marking, which is an innovation of Egyptian-Coptic lan- 
guage history, and the affectedness hierarchy with the non-imperfective, must reflect 
synchronic priorities. It is conceivable, a priori, that the marking spread randomly from
the imperfective to the non-imperfective without any functional basis. However, the
difference in marking frequency by verb type suggests that this was not the case. If it 
was, one would be at a loss to explain why some verbs do not have the marker with 
the non-imperfective tenses, but uniformly do with the imperfective ones. Note that my 
interpretation of Lycopolitan DOM is a counter-example against the generalisation that 
asymmetric DOM systems are not regulated by affectedness (Iemmolo 2013). The TAM- 
based split that has differing rules for the imperfective and non-imperfective tenses 
under similar syntactic conditions (obligatory vs. pragmatic-semantically determined 
DOM) already speaks against the general validity of this hypothesis. 

At first glance, there seems to be no particular information-structural reason why the 
Manichaean texts (L4) should have far fewer n-marked direct objects than the Gnostic 
texts (L6). The difference between L4 and L6 is significant, as indicated by a chi-square 
test with Yates' correction that yields a statistical significance at p < 0.001. Since the
n-marked construction was an innovation, one may feel inclined to assume that the dif- 
ference between the percentages in L4 and L6 would reflect an ongoing spread of the 
marker into the non-imperfective tenses. This would, in principle, mean that texts with 
a low incidence of the n-marked construction are from an older stage of language de- 
velopment, and texts with a high incidence of the n-marked construction are from a 
more recent stage. It is plausible to conceive that the use of n as a topic-marker was 
extended to non-topical contexts, so that more and more determined and specific ex- 
pressions would ultimately receive the marker within the non-imperfective domain (cf. 
Dalrymple & Nikolaeva 2011: 208). Affectedness may have been the path along which 
the construction spread. It might be argued, on the basis of the more frequent use of 
n-marking in L6, that the role of affectedness was then gradually diminished as definite- 
ness alone, irrespective of any eventual topical role of the object, would often trigger 
marking. This seems to move towards a clearer separation of a group of verbs (action 
verbs) that favoured n-marking from verbs of feeling that favoured zero-marking,
indicating a lexically-based selection of object construction (cf. Iemmolo 2013: 390).
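
The kind of test mentioned in this paragraph can be sketched as follows. This is a minimal illustration under a stated assumption, not a reconstruction of the test actually run: the paper does not say which counts entered the chi-square test, so the sketch uses the L4 and L6 counts pooled over the five verbs of Table 6 (70 of 404 and 64 of 92 n-marked objects). The function names yates_chi_square and p_value_df1 are invented here.

```python
import math


def yates_chi_square(a, b, c, d):
    """Chi-square statistic with Yates' continuity correction for the
    2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variate with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# ASSUMPTION: counts pooled over the five verbs of Table 6 (ji, eire, ti, kó, fi).
l4_marked, l4_total = 70, 404
l6_marked, l6_total = 64, 92

chi2 = yates_chi_square(
    l4_marked, l4_total - l4_marked,   # L4: n-marked, zero-marked
    l6_marked, l6_total - l6_marked,   # L6: n-marked, zero-marked
)
p = p_value_df1(chi2)

print(f"chi-square (Yates) = {chi2:.1f}, p = {p:.2e}")
```

On these assumed counts the statistic lies far above 10.83, the critical value for p = 0.001 with one degree of freedom, which is at least compatible with the significance level reported in the text.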

It is difficult to offer support for such an assumed diachronic scenario, or to refute 
it through independent criteria, since the dating of manuscripts, let alone of the texts 
themselves, is very insecure. But diachronic studies on DOM in Spanish show a similar 
span in object marking as that observed between the Lycopolitan subdialects, and these 
appear to have evolved over two centuries. Thus, in El Cantar de mio Cid from the 13th 
century, only 36% of animate direct objects are overtly marked (data from Brenda Laca, 
quoted in von Heusinger & Kaiser 2011: 602), yet two centuries later objects are marked
under identical conditions at 70%-90% (von Heusinger & Kaiser 2011: 610). Conversely,
such variation does not need to be understood as a reflex of language diachrony. This
can be seen in Old Japanese, where NPs from contemporary prose texts of the 10th century
are marked at 44%-72% (Sadler 2002: 248). Data from Portuguese also show that there 
can be substantial quantitative differences between contemporary texts (Delille 1970: 85, 
119-120). Furthermore, the letters from L9, in which object marking is sparingly attested, 
are originals and can be securely dated to the latter half of the 4th century AD. This 
makes them, for all practical purposes, contemporary with the text copies of L6, in which
n-marking is the dominant pattern. Thus, variation in object marking was acceptable
concurrently. Such cases are a reminder that differences between subdialects should not 
necessarily be interpreted as a reflex of diachronic development. Despite this, the blurry 
picture of Coptic DOM is likely to reflect an evolving DOM system. 

It is worth reasserting the lack of any role for animacy in Coptic DOM, to judge from 
the Lycopolitan corpus used in this paper. It is not possible to find any parallel alignment 
between verb hierarchy and animacy that is in a way similar to what von Heusinger
& Kaiser (2007) suggested in their analysis of Spanish. They observed a decrease in
object marking from the verbs ‘to kill’, ‘to see’, ‘to consider’, and ‘to have’, which were
analysed as representatives of different verb classes. Their conclusion that “the particular
ranking depends on the animacy requirement imposed by the verb on the direct
object” (von Heusinger & Kaiser 2011: 605) is not cogent because it was based on a study
of no more than four to six verbs. Searching the animate vs. inanimate objects listed in 
this database reveals no such animacy ranking. Rather, the Coptic data indicate that the 
affectedness scale is parallel to the decrease of control by the actor on the process of the 
verb. Furthermore, Coptic DOM calls into question the general validity of any theory 
that relies on the need for disambiguation, on syntactic or semantic grounds, between 
the agent and object as a motivation for DOM (e.g. Aissen 2003; de Swart 2005; Primus 
2012). The word order SVO means that there was no need for disambiguation of the core 
participants. 


7 Conclusion 


The present study supports the claim that Coptic DOM in the non-imperfective domain 
has a functional motivation and is not arbitrary. I do not claim to have formulated a set of
inviolable rules. Instead, I have shown tendencies that seem to be shared by all Lycopolitan
subdialects (except for L9), for which the n-marking number is too low to permit any
satisfactory conclusions. The clear differences in n-marking percentages between the
Lycopolitan subdialects do, however, confirm their relative independence. It is apparent
from the analysis that semantic factors act in conjunction with discourse-structural factors
in Lycopolitan Coptic. The quantitative analysis in §5, on the alternation of marking
of NPs as objects through n/Ø with non-imperfective tenses, has revealed striking differences
in marking between the semantic verb categories. There is an overall agreement
with Tsunoda's verb-type hierarchy: a highly-affected object with a dynamic action verb
(e.g. hótbe ‘to kill’) is likely to receive n-marking; a little-affected object is less likely to
receive n-marking (e.g. nouje ‘to throw’). A low n-marking percentage is found for the
few verbs of perception/cognition that take the n/Ø variation (saune ‘to (get to) know’,
cine ‘to find’). Verbs of feeling (e.g. meie ‘to love’) uniformly have a zero-marked
construction.

Although generalised findings from an analysis of Lycopolitan cannot be extended to
Coptic as a whole, it should be apparent that the semantics of verb types is a relevant
subject for future studies of DOM in that language.
circ  Circumstantial clause marker
NEG   Negative
PL    Plural
An exhaustive search of Old Japanese object NPs associated with weak floating quantifiers
and question-focussed object NPs containing interrogative words confirms the suggestion 
made in Yanagida & Whitman (2009), confirmed by Frellesvig et al. (2015), that Old Japanese 
had differential object marking (DOM) with specificity (defined by Frellesvig et al. 2015 as 
D-linking) as a necessary condition. Testing the same hypothesis on Early Middle Japanese, 
however, shows that this condition no longer obtained by the Heian Period. The resources 
for the expression of specificity and the set of conditions for differential object marking 
clearly changed over this span of the history of the Japanese language. 


1 Introduction 


Throughout its attested history Japanese has exhibited variable object marking: Some
object NPs are marked by the accusative case particle wo,¹ others are not. We give two
simple examples from Old Japanese in (1)-(2).²,³


¹The accusative case particle wo has been in use throughout the history of the language. Its phonemic shape
changed to /o/ around the year 1000 AD due to regular sound change; we will refer to the Japanese accusative
particle as ‘wo’ throughout the paper, except when citing examples, which have ‘wo’ or ‘o’ depending
on the age of the source. Like accusatives in many other languages, the Japanese accusative has
functions in addition to marking direct objects, mainly: marking adjuncts (path and source) and marking
subjects raised to object and subjects in a few absolute constructions.


Bjarke Frellesvig, Stephen Horn & Yuko Yanagida. A diachronic perspective
on differential object marking in pre-modern Japanese: Old Japanese and Early
Middle Japanese. In Ilja A. Seržant & Alena Witzlack-Makarevich (eds.),
Diachrony of differential argument marking, 163-186. Berlin: Language Science Press.


(1) [NP kwomatu     ga   sita   no   kaya   wo]  kara-sane
        small.pine  GEN  under  GEN  grass  ACC  cut-RESP.OPT

    ‘(I) want (you) to cut some of the grass under the small pine’ (MYS 1.11)


(2) akami-yama      [NP kusane  Ø]  kari-soke
    Akami-mountain      grass       cut-remove

    ‘cutting and removing grasses at Mount Akami...’ (MYS 14.3479)


Phenomena suggesting the existence of differential object marking (DOM) in Old 
Japanese (OJ) have long been noted, and hypotheses about the trigger for DOM in OJ 
have been developed and refined in recent decades (Motohashi 1989; Yanagida 2006; 
Yanagida & Whitman 2009) to the point where a robust formulation of a condition on 
DOM in OJ has now been proposed and tested in a survey of OJ object noun phrases of 
a few selected types (Frellesvig et al. 2015).

In the present study we present an expanded and exhaustive survey of OJ along the 
same lines as in Frellesvig et al. (2015), and then proceed to extend these same tech- 
niques to a body of texts representative of the immediately following historical variant 
of Japanese, Early Middle Japanese (EMJ), in order to ascertain whether the DOM system 
of the Japanese of the Asuka and Nara periods (as represented by texts from 712 CE to 
797 CE) persists into the Heian period (as represented by texts from 900 CE to 1110 CE). 

First we define the necessary condition for triggering DOM in OJ, viz. specificity defined
as D-linking (see §2 below). Next we describe how we used the Oxford Corpus of
Old Japanese (Frellesvig et al. 2014) to determine that reference to this condition contributes
to an observationally adequate description of DOM in OJ. We then present the
methods and results of a similar survey of EMJ using the Historical Corpus of Japanese
(National Institute for Japanese Language and Linguistics (NINJAL) 2014), which show
clear and significant differences in object marking between Old Japanese and the immediately
following period of Early Middle Japanese. In the discussion in §4, we summarize
and discuss these findings and identify areas for further research.


2 The conditions for DOM in OJ 


In the data set used for the study on DOM in OJ (described in more detail below) there 
are in total 4094 direct object noun phrases (NPs). Of these object NPs, 1946 (47.5%) are 


²Like modern Japanese, Old Japanese is head-final, has postposed particles, verbal suffixes in derivational
and inflectional morphology, and pervasive pro-drop (whence many of the examples we cite have no overt 
subject). Old Japanese has an extensive inventory of inflecting verbal suffixes, which are not found in mod- 
ern Japanese, expressing aspect, tense and mood. Old Japanese does not have a nominative case particle; 
subjects are sometimes bare and sometimes marked by one of the two genitive case particles no and ga. 
In modern Japanese ga has become a nominative case particle, whereas no remains a genitive in modern 
Japanese. See further Frellesvig (2010) about premodern Japanese. 

³Examples are transcribed in a time-appropriate phonemic transcription (see Frellesvig 2010: 33, 176 for
simple transcription guidelines). 
marked with the accusative case particle wo. It is evident that there is variation in object
marking (cf. also (1), (2) above), and the initial question is whether that is dependent on 
some factor or combination of factors. Following Frellesvig et al. (2015), we find that the 
alternation found in Old Japanese is related to a non-inherent discourse-based argument 
property. In this respect the distribution of OJ accusative case marker wo is similar in 
many respects to that of the Turkish accusative case suffix -i for direct objects (Enç 1991).
Also note that object marking alternation in OJ is found in wh-NPs under question focus 
(e.g. idure wo ka wakite sinwopamu “Which (of them) shall I praise, separating it out?”), 
which means that wo-marking does not imply topichood. Rather, we find that a necessary 
condition for DOM in OJ is a weak form of specificity which we define in terms of D- 
linking, the working definition of which we set out as follows: 


(3) D-linking: A relationship between an NP and a definite discourse referent, whereby 
the possible reference of that NP is restricted. 


Pesetsky (1987) used the term D(iscourse)-linking to characterize wh-NPs such as 
‘which student’ as having a special property due to their membership in a definite superset,
this being, moreover, a property with consequences for syntax. Generally, X in ‘which
X’ is linked with a definite discourse entity insofar as ‘which’ is uninterpretable without
a presupposed superset; thus such D-linked wh-NPs are weakly “specific” (Cinque 1990;
É. Kiss 1993). As an example with an overt superset rather than a merely presupposed 
one, consider the expression Among the students in this year's cohort, which is the best? 

Extending this idea, it is clear that this weak specificity can accrue to wh-NPs other 
than those containing ‘which’: For example, the contextual material accompanying the 
wh-NP ‘whom’ in the expression Among the students in this year’s cohort, whom should 
we trust? is sufficient to render that wh-NP D-linked. The phrase “what else” in the ex- 
pression What else do you want? is D-linked to a definite discourse entity by the relation 
of exclusion, as that narrows the possible reference of the NP ‘what else’.⁴

Further extending the idea, we see that the same kind of weak specificity can be a 
significant property of indefinite NPs that do not contain wh-words at all, established 
through the same kinds of D-linking relations, a typical example being that of a definite 
possessive NP complement, as in the farm's products, but potentially established in a 
variety of ways (e.g. a man on the bus, a limb off the tree, another glass of beer, etc.). We 
also stipulate that D-linking is not an irreflexive relation. Thus a definite NP is D-linked 
through the relation of identity it has with itself. By this we also include co-indexing 
through previous mention and pronominal reference as a way to establish D-linking. 
Thus we account for the distribution of accusative case marking on both definite objects 
and indefinite specific ones by reference to one principle. 

⁴It follows that DOM conditioned by D-linking can trigger an interpretation of weak specificity in a wh-NP
that would otherwise be construed as non-specific (as Dalrymple & Nikolaeva 2011: 210-211 observe for
Persian). While all of the examples of wo-marked wh-NPs in our OJ data are accompanied by contextual
material for D-linking (see §2.2), this is valuable new information for the interpretation of wo-marked
objects in general in Old Japanese texts.

Needless to say, there are also many ways for the definite discourse referents upon
which these various relations depend to find their way into the common ground: previous
mention, ostension, presupposition accommodation, uniqueness, etc. In OJ, the
effect of weak specificity can be seen in the near-minimal pair (1) and (2) given above. 
The object NP in (1) is modified by an NP complement containing the NP kwomatu ‘small 
pine’ and is marked by accusative case particle wo. Because kwomatu (‘the small pine’) is
definite, as the context in the poem shows, the reference of the object NP kaya wo ‘grass’
is at least weakly specific due to the D-linking relation which maps the whole object NP
to the definite discourse entity denoted by the NP complement. Accordingly, we translate
the object NP here as having at least a partitive relation: some of the grass under
the small pine, but the NP also could potentially refer to all the grass under the small pine. 
The marking of the object NP conforms to the fact that it has at least weak specificity, 
which property satisfies a necessary condition for object marking. By contrast, the ob- 
ject NP in (2) kusane ‘grass’ is unmodified and unmarked, consistent with non-specific 
reference, which we translate here with an English plural common noun ‘grasses’. When 
we look at the wider context of the expression in (2) we see that a non-specific amount 
of grasses are cut in order to open a space for lying down. The presence and absence 
of object case marking seen respectively in (1) and (2) corresponds to the presence and 
absence of specificity in the reference of the NPs. 

This analysis is supported by Yanagida's (2006) observation that a great preponder- 
ance of unmarked object noun phrases in OJ are composed of unmodified common nouns, 
while wo-marked object NPs are frequently modified by NP complements as in (1) or by 
relative clauses. Overt modification, while restricting the possible reference of the NP so 
modified, does not by itself ensure a D-linking relationship, but reference to a definite 
discourse entity within the modifying material is one way to establish D-linking, which 
in turn licenses object marking. Needless to say, stronger types of specificity, including 
epistemic specificity and definiteness, also license object marking in OJ. 

Furthermore, as observed by many (Matsuo 1944; Matsunaga 1983: 48; Miyagawa 1989; 
Yanagida 2006), object-marking in OJ is associated with leftward movement, so that, 
for example, in this SOV language, wo-marked object NPs co-occurring with genitive- 
marked subject NPs appear to the left of those subject NPs (e.g. example (16) below), 
with extremely few exceptions. Yanagida & Whitman (2009) identify this as a movement 
to a position outside of the domain of existential closure in the verb phrase. This is a 
phenomenon common to specific object NPs, as described by Diesing (1992), inter alia. 
In contrast, bare, unmodified, common noun-headed object NPs in OJ commonly appear 
adjacent to the verb (Yanagida 2006; Yanagida & Whitman 2009). These distributions 
conform very well with what we observe here.⁵

We also note that object NPs composed of personal pronouns (Wrona & Frellesvig 
2010, inter alia) and NPs modified by demonstratives are also fairly regularly object 
marked. However, we also find clearly specific object NPs that are unmarked. For ex- 
ample, we found 47 object NPs containing demonstrative ko ‘this’ at some structural 
level. All of these NPs are specific, and indeed many of them are definite, but while 25 
are accusative case marked as predicted, e.g. (4), 22 of them are bare, e.g. (5). 


⁵Note, however, that there are rare examples of unmodified wo-marked NPs that appear adjacent to the
selecting verb (possibly cases of vacuous movement). Conversely leftward movement does not imply speci- 
ficity, as wh-items in question focus are regularly left-shifted, and many of these are non-specific. 
(4) ko    no   miki  wo   kami-kyemu      pito    pa
    this  GEN  wine  ACC  chew-must.have  person  TOP

    ‘as for the person who must have brewed this wine’ (KK 40)


(5) Yamato  no   womura  no   take  ni   sisi  pusu  to    tare  ka
    Yamato  GEN  Womura  GEN  peak  DAT  game  lie   COMP  who   FOC
    ko    no   koto     opomapye  ni   mawosu?
    this  GEN  content  Emperor   DAT  say

    ‘That deer lie on the peaks in Womura in Yamato - who is it that says this thing
    to His Majesty?’ (NSK 75a, b)
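
The proportions reported so far can be cross-checked with a few lines of arithmetic. The following Python sketch is only an illustration: the helper name `share` and the variable names are our own assumptions, and the counts are copied from the text rather than recomputed from the corpus.

```python
def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Counts as reported in the text (not re-extracted from the OCOJ).
all_object_nps = 4094      # direct object NPs in the OJ data set
wo_marked_nps = 1946       # of which marked with accusative wo
ko_total = 47              # object NPs containing demonstrative ko 'this'
ko_marked, ko_bare = 25, 22

overall_marking_rate = share(wo_marked_nps, all_object_nps)  # 47.5
ko_counts_consistent = (ko_marked + ko_bare == ko_total)     # True
ko_marking_rate = share(ko_marked, ko_total)                 # 53.2

print(overall_marking_rate, ko_counts_consistent, ko_marking_rate)
```

On these figures, NPs containing ko are wo-marked only slightly more often than object NPs overall, which is in line with the observation that specificity does not by itself guarantee marking.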


The pattern for object marking in OJ outlined above may be summarized as follows: 


(6) a Accusative case marked objects are specific; 
b non-specific objects are not accusative case marked; 
c not all specific objects are accusative case marked. 


This leads us to form the following hypothesis: 


(7) Condition on DOM in OJ: Specificity is a necessary condition for object marking
in OJ, the weakest form of specificity being D-linking. However, specificity is not 
a sufficient condition for object marking in OJ. 


In this paper we focus on this condition and its falsifiability, but do not to any significant
extent discuss the (important) issue of when specific objects in OJ are not
accusative case marked. See, however, §4 for some remarks on this.

The hypothesis that some kind of specificity is a necessary but not sufficient condition
for DOM is falsifiable by finding an unambiguously non-specific NP which is also
accusatively marked. Unfortunately, there is no linguistic pattern in OJ that can be said
to be an unambiguous and categorical marker of non-specificity, making it difficult to
search for counterevidence to our hypothesis on the basis of linguistic forms in an electronic
corpus. But there are at least two classes of object NPs which, other things being
equal, have reference that is normally non-specific: (i) object NPs associated with weak
Floating Quantifiers (FQs) of the form [numeral + classifier],⁶ and (ii) object NPs containing
wh-words (except for idure ‘which’, discussed in more detail below) and having
question focus.⁷ As it is, variable object marking is attested among both of these classes
of NPs, suggesting that under marked conditions both types of NP can have specific
reference. In order to establish a systematic and exhaustive method which can also be
applied to following stages of the language, with much larger volumes of material available,
we therefore examined all attestations of these two classes of object NPs using the
Oxford Corpus of Old Japanese (OCOJ, Frellesvig et al. 2014), with the aim of demonstrating
whether a D-linking relation would be retrievable for the wo-marked object NPs. If
not, such examples could constitute counterevidence to the hypothesis about DOM.


⁶For example, in the Modern Japanese (NJ) expression Dooro de sika o rop-piki mita ‘On the road I saw six
deer’, a non-specific object NP sika ‘deer’ is associated with the FQ rop-piki ‘six-animal’. For this first class
of object NPs, in special cases the reference can be specific, and indeed even definite, but the function of
the FQ ceases to be weakly quantifying in such cases (discussed in more detail below) (Kim 1995, inter alia).

⁷For example, in the NJ expression Mado kara dare o mimasita ka ‘From the window, whom did you see?’
the object NP dare o ‘whom’ is under question focus. For this second class of object NPs, the reference is
at most only weakly specific, and that only under special conditions (discussed in more detail below).
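
The logic of this falsification test can be pictured as a simple filter over annotated object NPs. The following Python sketch is an illustration only and not the procedure actually run on the OCOJ: the `ObjectNP` record, its field names, and the `counterexamples` helper are assumptions of ours, and the specificity values are the judgments reported in the text for examples (1), (2), (4), and (5), which rest on contextual interpretation rather than on anything a corpus query could return.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ObjectNP:
    ref: str         # text reference, e.g. "MYS 1.11"
    wo_marked: bool  # True if the NP carries accusative wo
    specific: bool   # analyst's judgment: at least D-linked

def counterexamples(nps):
    """Return the wo-marked object NPs that were judged non-specific.

    Only these falsify the necessary condition in (7); bare specific NPs
    do not, because specificity is not claimed to be sufficient.
    """
    return [obj for obj in nps if obj.wo_marked and not obj.specific]

sample = [
    ObjectNP("MYS 1.11", wo_marked=True, specific=True),       # example (1)
    ObjectNP("MYS 14.3479", wo_marked=False, specific=False),  # example (2)
    ObjectNP("KK 40", wo_marked=True, specific=True),          # example (4)
    ObjectNP("NSK 75a, b", wo_marked=False, specific=True),    # example (5)
]

print(counterexamples(sample))  # an empty list: no counterevidence here
```

Note that example (5), a bare but specific object, passes the filter untouched: it instantiates (6c) and is compatible with the hypothesis.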

The data set we used for the OJ survey was extracted from the September 2014 version 
of the OCOJ (Frellesvig et al. 2014), which primarily uses sources from the Nihon koten 
bungaku taikei (Iwanami shoten, 1957-1962) as critical editions. We used a sub-corpus 
comprising all extant poetic texts from 712 CE to 797 CE, drawing material from the 
following sources: Kojiki kayō, Nihon shoki kayō, Fudoki kayō, Bussokuseki-ka, Shoku
nihongi kayō, Man'yōshū. It is thought that some of the poetic texts in these works are
considerably older than the earliest date of compilation. The volume of the corpus is 
4,979 poems, comprising 89,419 words. 

We looked at the two types of NPs which would under normal, unmarked conditions 
be non-specific. As predicted, the exhaustive examination of these object NPs showed 
that: 


(8) a There is a correspondence between accusative case marking and specific in- 
terpretations for these two types of NPs (corroborated by the presence of 
contextual clues); and 

b NPs of these two types receiving unambiguously non-specific interpretations 
(again corroborated by contextual clues) are bare. 


Details for both types of NP are presented together with examples in the sections that 
follow. In the remarks that follow we only discuss the reference of marked object NPs, as 
only these serve as potential counterevidence to the hypothesis for a condition on DOM 
in OJ (7). 


2.1 Specificity of object NPs associated with weak floating quantifiers 
in OJ 


Out of the attested 100 expressions with the form of weak FQs in the data set, we found 4 
attestations of FQs both indexed with wo-marked object NPs and functioning as adverbial
modifiers of the predicate selecting their respective host NPs (see examples (9)-(11)). In 
all cases the reference of the host NP was in fact definite. When an FQ that, other things 
being equal, would be interpreted as weakly quantifying (e.g. a cardinal FQ) is paired 
with a definite host NP, the resulting expression is construable in two ways; either as 
meaning ‘n members of a definite superset’ (i.e. ‘n of them’, where the FQ behaves as
what we might call a partitive quantifier), or as a cardinally specified universal quantifier 
(e.g. ‘both’, i.e. “all of them, with a cardinality of 2”). The interpretations presented in the 
examples below are derived accordingly. We present all four examples in the following. 

The definiteness of the host NP in example (9) derives from the fact that the relative 
clause modifying the head noun kamwi ‘god’ serves to define a definite superset: ‘those
gods known as Chinese Tigers’.⁸ The FQ ya-tu ‘eight-thing’ functions as a partitive quantifier
‘eight members of the superset’.


(9) karakuni  no   twora  to    ipu  kamwi  wo   ikedorini
    China     GEN  tiger  COMP  say  god    ACC  live.take.as
    ya-tu        tori-moti-ki
    eight-thing  take-hold-come

    ‘taking and bringing by capturing live eight of those gods called Chinese
    Tigers...’ (MYS 16.3885)


In (10) the definiteness of the host NP sinokipa ‘arrow’ derives from a combination 
of metaphor and previous mention, explained in detail in Frellesvig et al. (2015). The FQ 
functions as a cardinally specified universal quantifier ‘both’. 


(10) ...adusa-yumi   yu-bara    puri-okosi   sinokipa  wo   puta-tu
        catalpa-bow  bow-belly  swing-raise  arrow     ACC  two-thing
     ta-basami   panati-kyemu     pito    si   kuti-wosi
     hand-pinch  loose-must.have  person  RES  mouth-regrettable

     ‘Deplorable, the person who (...) must have raised a bow, pinched both those
     arrows, and shot them away!’ (MYS 13.3302)


The definiteness of the two host NPs in example (11) u ‘cormorant’ is inferred from 
the method of fishing referred to in this poem, which involves using exactly eight cor- 
morants carried four to a basket, two baskets to a pole (see Frellesvig et al. 2015: 202). 
Thus, the two FQs function as cardinally specified universal quantifiers ‘all eight’. 


(11) kami   tu   se      ni   u          wo   ya-tu        kaduke     simo
     upper  GEN  stream  DAT  cormorant  ACC  eight-thing  make.dive  lower
     tu   se      ni   u          wo   ya-tu        kaduke
     GEN  stream  DAT  cormorant  ACC  eight-thing  make.dive

     ‘making all eight of [my] cormorants dive in the upper reaches, making all
     eight of [my] cormorants dive in the lower reaches...’ (MYS 13.3330)


Again, the reference for every wo-marked host NP of FQ is definite. Given our defi- 
nition of D-linking in (3) and the stipulation that definite NPs are D-linked by reflexive 
identity, we determine that all wo-marked object NPs associated with FQs in OJ are at 
least weakly specific in reference. Accordingly, for this class of NPs, no counter-evidence 
to the hypothesis is found. 


⁸It is well-known that NPs of the form X to iu Y ‘Y which is called X’ regularly form definite descriptions.


2.2 Specificity of object NPs containing WH-words with question focus


The set of wh-words in OJ is as follows: 


(12) WH-words in OJ: ta, tare ‘who’; idu ‘where’; iduku ‘where’; idura ‘where (abouts)’;
idupye ‘which direction’; idure ‘which’; idusi ‘which side’; iduti ‘which direction’;
ika ‘how’; iku ‘how many’; ikubaku ‘how much’; ikuda ‘how much’; ikupisa(sa) 
‘how long ago’; ikura ‘how much’; ikutu ‘how much’; itu ‘when’; nado, ado ‘why’; 
na, nani ‘what’; ani ‘how’; uremuso ‘why’ 


The OCOJ has 469 occurrences of wh-words. Out of these, we identified 70 that are 
contained in object NPs, of which 21 are wo-marked. Of these wo-marked NPs containing 
wh-words, there are 18 which have question focus (i.e. are themselves wh-NP objects). As 
for the remaining 3 object NPs, they do not have question focus, either due to the focus 
being discharged within a complement clause embedded in a relative clause, or due to 
the wh-word functioning as a quantifier, or both. For example, in (13) below, the wh-word 
(itu ‘when’) is contained in an adverb NP (itu si ka mo) of a complement clause (itu ... 
mimu to) embedded in a relative clause (itu ... omopisi) modifying the head (apasima) 
of an object NP. The force of the wh-word is discharged at the level of the complement 
clause. The whole utterance forms a yes/no question. 


(13) [NP itu   si   ka   mo    mi-mu     to    omopi-si    apa-sima    wo]
         when  RES  FOC  ETOP  see-will  COMP  think-SPST  Awa-island  ACC
     yoso  ni   ya   kwopwi-mu
     afar  DAT  FOC  yearn-will

     ‘Shall [I] have to yearn from afar for Awa Island, about which [I] thought,
     “When will [I] see it?”?’ (MYS 15.3631)


Non-question focus object NPs such as these are excluded from consideration, because 
they can easily have definite reference, as does in fact the example in (13). 

Out of the wh-words in OJ, listed above, only the following appear in the formation of 
wo-marked object wh-NPs: ika ‘how’; ta, tare ‘who’; nani ‘what’; idure ‘which’. Under normal,
unmarked conditions, NPs containing such wh-words (with the exception of idure
‘which’) would be non-specific. However, significantly, in these 18 examples, the NPs 
containing them are wo-marked and in fact weakly specific in reference. For example, in 
(14) immediately below, the reference of the wh-NP headed by yosi ‘opportunity’ is asso- 
ciated with a definite event that occurred by chance, the D-linking established through 
the relationship of exclusion: ‘what manner of opportunity other than by chance’. In all 
18 examples (14)-(31), the wo-marked object wh-NP is accompanied by contextual mate- 
rial by which that NP is construable as related to a definite discourse entity. While we 
cannot include here all the considerations by which the judgments on reference status 
were made due to lack of space, we reflect as much as possible in the translations. 
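
As a bookkeeping check, the example numbers grouped under the four wh-word headings of this section can be tallied against the totals given above. The Python sketch below is only an illustration; the dictionary layout and variable names are our own assumptions, while the example ranges and the totals (21 wo-marked object NPs containing wh-words, 3 of them without question focus) come from the text.

```python
# Example numbers as grouped under the headings of this section.
examples_by_wh_word = {
    "ika": range(14, 15),      # (14)
    "ta/tare": range(15, 20),  # (15)-(19)
    "nani": range(20, 28),     # (20)-(27)
    "idure": range(28, 32),    # (28)-(31)
}

counts = {word: len(nums) for word, nums in examples_by_wh_word.items()}
total_question_focus = sum(counts.values())

# Totals reported in the text.
wo_marked_wh_objects = 21
without_question_focus = 3

print(counts)  # {'ika': 1, 'ta/tare': 5, 'nani': 8, 'idure': 4}
print(total_question_focus == wo_marked_wh_objects - without_question_focus)  # True
```

The tally gives 18 question-focus examples, matching both the count stated in the text and the difference between the 21 wo-marked NPs and the 3 excluded ones.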
2.2.1 ika ‘how’


(14) tamasakani  wa  ga   mi-si     pito    wo   ika  nara-mu
     by.chance   I   GEN  see-SPST  person  ACC  how  COP-will
     yosi         wo   motite   ka   mata   pito-me      mi-mu
     opportunity  ACC  holding  FOC  again  one-glimpse  see-will

     ‘The person whom I met by chance - having what other manner of opportunity
     is it that [I] will see a glimpse of her again?’ (MYS 11.2396)


2.2.2 ta/tare ‘who’


(15) yamato  no   takasazinwo  wo   nana   yuku  wotomye-domo  tare  wo
     Yamato  GEN  Takasazino   ACC  seven  go    girl-PL       who   ACC
     si   maka-mu
     RES  wrap-will

     ‘As for the seven maidens walking along the plain Takasazi in Yamato - whom
     (of them) will [you] wed?’ (KK 15)


(16) nagatukwi  no   sigure  no   ame   no   yama-gwiri     no  asibuseki
     9th.month  GEN  shower  GEN  rain  GEN  mountain-mist  as  fretful
     a  ga   mune    ta   wo   miba    yama-mu
     I  GEN  breast  who  ACC  see.if  stop-will

     ‘As for my breast which is fretting like the mountain mist of the rain showers of
     the 9th month, if [I] see whom (other than you) shall it quieten?’ (MYS 10.2263a)


(17) maywone  kaki     tare  wo   ka   mi-mu     to    omopitutu
     eyebrow  scratch  who   ACC  FOC  see-will  COMP  think.while
     ke-nagaku  kwopwi-si   imo      ni   ap-yeru    kamo
     days-long  yearn-SPST  beloved  DAT  meet-STAT  SFP

     ‘Scratching [my] eyebrow, thinking, “Whom (other than you) am [I] about to
     see?” here [I] am meeting my beloved (i.e. you) whom [I] have longed for day in
     and day out!’ (MYS 11.2614b)


(18) kapyeru  beku   toki  pa   nari-kyeri   miyakwo  nite  ta   ga
     return   ought  time  TOP  become-MPST  capital  COP   who  GEN
     tamoto  wo   ka   wa  ga   makuraka-mu
     sleeve  ACC  FOC  I   GEN  lie.upon-will

     ‘The time has come for [us] to return. In the capital, the sleeve of whom (other
     than my departed wife) shall I use as my pillow?’ (MYS 3.439)


(19) asigara   no   ya-pye-yama          kwoyete   imasi-naba   tare  wo
     Ashigara  GEN  eight-fold-mountain  crossing  come-PFV.if  who   ACC
     ka   kimi  to    mitutu  sinwopa-mu
     FOC  lord  COMP  seeing  praise-will

     ‘If [you] cross the eight-fold mountains of Ashigara, then whom (else) shall [I],
     thinking [it] to be my lord, admire?’ (MYS 20.4440)
2.2.3 nani ‘what’


(20) kasuga.nwo    no   pudi      pa   tiri-nite        nani  wo   ka   mo
     Kasuga.field  GEN  wisteria  TOP  scatter-PFV.GER  what  ACC  FOC  ETOP
     mi-kari   no   pito    no   worite        kazasa-mu
     PFX-hunt  GEN  person  GEN  breaking.off  don-will

     ‘The wisteria flowers on Kasuga fields having scattered, what else shall the
     hunters break off and wear on their heads?’ (MYS 10.1974)


(21) kokoro  sape  matur-eru      kimi  ni   nani  wo   ka   mo    ipa-zu
     heart   even  offer.up-STAT  lord  DAT  what  ACC  FOC  ETOP  say-NEG
     ipi-si    to    wa  ga   nusumapa-mu
     say-SPST  COMP  I   GEN  steal-will

     ‘To you whom [I] have given the very meaning (my very heart), what (else)
     would I steal from you by saying, “[It] is a thing which was said without
     speaking”?’ (MYS 11.2573)


(22) moti       no   pi   ni   sasi-iduru       tukwi  no  takatakani  kimi
     mid.month  GEN  day  DAT  direct-come.out  moon   as  refinedly   lord
     wo   imasete      nani  wo   ka   omopa-mu
     ACC  making.come  what  ACC  FOC  think-will

     ‘Having you come resplendently like the moon that comes out on the 15th of the
     month, what (else) could [I] wish for?’ (MYS 12.3005)


(23) yama-gapi        ni   sak-yeru    sakura          wo   tada
     mountain-saddle  DAT  bloom-STAT  cherry.blossom  ACC  just
     pito-me      kimi  ni   mise-teba    nani  wo   ka   omopa-mu
     one-glimpse  lord  DAT  show-PFV.if  what  ACC  FOC  think-will

     ‘If [I] managed to show my lord just once the cherry blossoms that bloom in the
     saddle of the mountain, what (else) could [I] wish for?’ (MYS 17.3967)


(24) ipye  ni   yukite  nani  wo   katara-mu     asipikwino
     home  DAT  going   what  ACC  recount-will  (pillow.word)
     yama-pototogisu  pito-kowe  mo    nakye
     mountain-cuckoo  one-chirp  ETOP  cry.IMP

     ‘Mountain cuckoo, sing even one note! Going home, what (other than that) shall
     [I] recount?’ (MYS 19.4203)


(25) ima-sarani  nani  wo   ka   omopa-mu    uti-nabiki    kokoro  pa   kimi
     now-newly   what  ACC  FOC  think-will  PFX-lie.down  heart   TOP  lord
     ni   yori-ni-si       monowo
     DAT  depend-PFV-SPST  given.that

     ‘At this late date, what more could [one] ask for, given that [my] heart, lying
     down, has given itself over to you?’ (MYS 4.505)
(26) ame-tuti      wo   terasu      pi-tukwi  no  kipami  naku     aru
     heaven-earth  ACC  illuminate  sun-moon  as  limit   lacking  be
     beki   monowo      nani  wo   ka   omopa-mu
     ought  given.that  what  ACC  FOC  think-will

     ‘Given that [it] must have no limit, just as the sun and moon which illuminate
     heaven and earth, what else could [one] wish for?’ (MYS 20.4486)


(27) sipo  pwi-naba    tama-mo     kari-tume       ipye  no   imo      ga
     tide  ebb-PFV.if  jewel-weed  cut-gather.IMP  home  GEN  beloved  GEN
     pama-dutwo      kopaba  nani  wo   simyesa-mu
     beach-souvenir  beg.if  what  ACC  proffer-will

     ‘When the tide goes out, cut and gather some jewel-seaweed. If my darling at
     home asks for a beach souvenir, what (other than that) would [we] proffer?’
     (MYS 3.360)


2.2.4 idure ‘which’ 


The wh-word idure ‘which’ is inherently specific, and NPs headed by idure (e.g. in (29) 
below) or with idure as a direct NP complement to the head (e.g. in (28), (30), (31) below) 
are specific. There are 4 examples of an object wh-NP formed with idure as a head or as 
a direct NP complement, and as expected all are wo-marked. 


(28) asipikwino tama-kadura no kwo kyepu no goto idure no
(pillow.word) jewel-vine GEN child today GEN like which GEN
kuma wo mitutu ki-ni-kyemu
bend ACC seeing come-PFV-must.have

‘Oh child of the jewel-vine, seeing which bends in the mountain road must
[you] have come here, as [I come] today?’ (MYS 16.3790)


(29) idure wo ka wakite sinwopa-mu
which ACC FOC separating praise-will

‘... Which shall [I] praise, separating [it] out? ...’ (MYS 18.4089)


(30) watatumi no idure no kamwi wo inoraba ka yuku
sea.god GEN which GEN god ACC supplicate.if FOC go
sa mo ku sa mo pune no paya-kye-mu
way ETOP come way ETOP boat GEN fast-be-will

‘Which god of the sea is it that, if [I] beseech it, the boat will be fast both on the
way out and the way back?’ (MYS 9.1784)


(31) ame-tusi no idure no kami wo inoraba ka utukusi
heaven-earth GEN which GEN god ACC beseech.if FOC dear
papa ni mata koto-twopa-mu
mother DAT again word-ask-will

‘Which of the gods of heaven and earth is it that, if [I] beseech it, [I] will speak
again to my dear mother?’ (MYS 20.4392)


Thus, for object NPs containing wh-words and having question focus, which under 
normal, unmarked conditions would be expected to have non-specific reference, all wo- 
marked examples are demonstrably D-linked and thereby specific, so that no counter- 
evidence to the hypothesis about the condition on DOM in OJ (7) is found. 


3 Does the DOM system of OJ persist into EMJ? 


In this section we will address the question of whether Early Middle Japanese (EMJ, 
900 CE to 1110 CE) exhibits the same system of DOM as OJ, concluding that it does 
not. We will show that in EMJ both specific and nonspecific objects may be wo-marked 
or bare, unlike OJ, which disallows non-specific wo-marked objects. We will first show
three examples (all taken from Makura no sōshi) which show that EMJ, like OJ, had
wo-marked specific objects (32), bare specific objects (33), and bare non-specific objects 
(34). Following that we will present the results of an investigation of whether EMJ had 
non-specific wo-marked objects. 

In (32) the object denotes particular body parts of previously mentioned people, and 
as such the reference is D-linked and specific. The object NP is wo-marked. 


(32) Specific, wo-marked object NP
uta-rezi to youi site tuneni usiro o
hit-PASS.will.not COMP preparation doing constantly behind ACC
kokoro-dukawi si-taru kesiki
heart-dispatch do-STAT sight

‘the sight of [them] constantly guarding [their] behinds taking care lest [they]
be struck’ (Makura no sōshi, 3, Shinpen Zenshū, vol. 18, p. 28)


In (33) previous mention of augi ‘fan’ and putokorogami ‘pocket paper’ is seen in the
immediately preceding context, establishing D-linking through the relation of previous 
mention, and yet both object NPs are bare. 


(33) Specific, bare object NP
augi tatau-gami nado yobe makura-gami ni
fan folding-paper etc. last.night pillow-head DAT
oki-sikado onodukara pika-re tiri-ni-keru o motomuru
put-SPST.although naturally pull-PASS scatter-PFV-MPST ACC search
ni kurakereba ikade ka wa mi-mu idura idura
DAT dark.because how FOC TOP see-will where where
tataki-watasi mi-idete augi putaputato tukawi putokoro-gami
pat-cross see-putting.out fan (mimetic) use pocket-paper
sasi-irete makari-na-mu to bakari koso iu rame
stick-put.in go.home-PFV-will COMP RES FOC say EXT

‘Although [he] had put [his] fan and folded paper and such at the head of his
pillow the night before, when [he] searches [for them] among the things that
naturally became disturbed and scattered, it being dark, how shall [he] ever find
them? – saying, “Where? Where?” patting the whole area, and finding them, [he]
uses [his] fan, “woosh-woosh”, and sticking [his] pocket-paper in, what [he]
would surely say is something like, “[I’ll] be going now”.’ (Makura no sōshi, 61,
Shinpen Zenshū, vol. 18, p. 117)


The example in (34) is the first entry in a list entitled ‘Despicable things’. There is no
previous context other than the title of the list, and given the public nature of the list and
the negative evaluations levied on the items therein, anything more than a non-specific 
reference would be unthinkable. The object NP is bare. 


(34) Non-specific, bare object NP
nadeu koto naki pito no we-gati nite mono
any.in.particular thing lacking person GEN laugh-tending COP thing
itau iwi-taru
extremely say-STAT

‘A person who has nothing to commend himself, smugly saying things volubly’
(Makura no sōshi, 26, Shinpen Zenshū, vol. 18, p. 65)


As mentioned, (32)-(34) conform to the observed OJ distribution of specificity and 
case marking. In order to examine whether the DOM system of OJ is also found in EMJ 
we investigated systematically the existence of wo-marked non-specific objects, which 
were disallowed in OJ. We used the methodology outlined above and examined object 
NPs associated with weak FQs and object wh-NPs with question focus, using the Heian 
Japanese sub-corpus of the Historical Corpus of Japanese (NINJAL 2014) in conjunction 
with the Chūnagon search application available from the National Institute of Japanese
Language and Linguistics. The Heian Japanese sub-corpus of the Historical Corpus of
Japanese represents prose and poetry in texts produced between 900 CE and 1110 CE,
using texts from the Shinpen Nihon koten bungaku zenshū (Shōgakukan, 1994) as critical
editions. The Heian sub-corpus is composed of the following texts: Kokin wakashū, Tosa
nikki, Taketori monogatari, Ise monogatari, Ochikubo monogatari, Yamato monogatari,
Makura no sōshi, Genji monogatari, Murasaki Shikibu monogatari, Izumi Shikibu
monogatari, Sarashina nikki, Sanuki no suke nikki, Heichū monogatari, Kagerofu nikki, Tutumi
Chūnagon monogatari. The texts are primarily prose, with some poetry. The sub-corpus
contains 738,153 words. 

Exhaustively examining NPs in EMJ fitting the same description as that for OJ outlined
above, we found that the Condition on DOM in OJ in (7) does not hold for EMJ. The
evidence for this conclusion comes in the form of wo-marked non-specific object NPs.
In a situation where there are no overt forms that can be used to unambiguously mark
NPs as having non-specific reference, a demonstration of this evidence relies on close
examination of the previous context, and various considerations about the most plausible
interpretations of NPs that appear in the text.


3.1 Specificity of object NPs associated with weak floating quantifiers 
in EMJ 


A search of the sub-corpus for object NPs associated with weak FQs in EMJ yielded 
results from texts produced between 900 CE (Taketori monogatari) and 1010 CE (Genji 
monogatari). We found 512 expressions of the form [numeral+classifier] in Heian texts. 
Among these we found 80 examples associated with object NPs. Of these 80 object NPs, 
8 are accusative case marked, and if the OJ system of DOM were to persist in EMJ, we 
would expect all 8 wo-marked object NP hosts of FQs to be specific in reference. However, 
of the 8 wo-marked objects, 3 are arguably non-specific. We give all three examples below.
For example, in (35) below, a simile is drawn to a hypothetical situation in which two 
plums are stuck in the place where eyes should be. There is no mention of these plums 
in previous context, and they have no links to definite discourse referents. 


(35) karouzite oki-agari-tamap-eru wo mireba kaze ito omoki
barely sit.up-rise-RESP-STAT ACC see.when illness very heavy
pito nite para ito pukure konata kanata no me ni
person COP belly very swell this.side that.side GEN eye DAT
pa sumomo wo puta-tu tukeru yau nari
TOP plum ACC two-thing attach appearance COP

‘... looking at [him] as [he] barely managed to raise himself, [he] was like
someone with a terrible cold, [his] belly swelled up and it was as if [someone]
had stuck two plums to [his] eyes on the one side and the other’ (Taketori
monogatari, Shinpen Zenshū, vol. 12, p. 48)


In (36) below, there is no previous mention of bridges in relation to the place called 
Yatsuhashi. They are newly introduced and are unlinked to any definite discourse refer- 
ent. 


(36) Mikapa no kuni yatupasi to ipu tokoro ni itari-nu
Mikawa GEN country Yatsuhashi COMP say place DAT arrive-PFV
soko wo yatupasi to ipi-keru pa midu yuku kapa
this.place ACC Yatsuhashi COMP say-MPST TOP water go river
no kumo-de nareba pasi wo ya-tu watas-eru ni
GEN spider-hand COP.because bridge ACC eight-thing cross-STAT DAT
yorite namu yatupasi to ipi-keru.
depending FOC Yatsuhashi COMP say-MPST


‘[They] came to a place called Yatsuhashi. As for its being called Yatsuhashi, it was
due to the fact that [they] spanned eight bridges over it, because the river of
water divided into spider legs, that [they] called it “Yatsuhashi”.’ (Ise monogatari,
Shinpen Zenshū, vol. 12, p. 120)


In (37) below, the main character is depicted as doing something unexpected and mar- 
velous: releasing fireflies into a woman's bedchamber. Both the fireflies and the cloth 
panel he used to conceal them are newly introduced into the scene and have no links to 
a definite discourse referent. 


(37) yori-tamawite mikityau no katabira wo pito-pe
depend-RESP.GER standing.blind GEN panel ACC one-layer
uti-kake-tamau ni awasete sa.to pikaru mono ga
PFX-hang-RESP DAT matching suddenly glow thing GEN

‘... and just as [Otodo], drawing near, draped a panel from a standing blind (over
the crossbeam), suddenly something glowing ...’ (Genji monogatari, ‘Hotaru’,
Shinpen Zenshū, vol. 22, p. 200)


Non-specific expressions of this form are unattested in OJ and violate the condition 
on DOM (7), indicating that the OJ system of DOM is no longer operative in EMJ. 


3.2 Specificity of object NPs containing WH-words in EMJ 


Given the fact that the EMJ sub-corpus does not include mark-up of constituents larger 
than the unit ‘word’, and has made no provision for the annotation of grammatical role,
it is impossible to mechanically identify object NPs in general, including those associated
with FQs (as above) and those containing wh-words (as below). Rather, attestations of 
the distinguishing part of speech have to be examined individually to determine their 
syntactic position and to determine the grammatical roles of the constituents which they 
distinguish. Given the difficulties of working with the EMJ sub-corpus, for this study we 
restricted our search to just variants of two wh-words for comparison with OJ: ta, tare 
‘who’ and na, nani ‘what’. It will be recalled that the wh-words in OJ which figure in the 
formation of wo-marked object wh-NPs are the following: ika ‘how’; ta, tare ‘who’; na, 
nani ‘what’; idure ‘which’. 


3.2.1 WH-word tare ‘who’ 


A search of the sub-corpus for object NPs containing wh-word ta, tare ‘who’ yielded 
results from texts produced between 900 CE (Taketori monogatari) and 1110 CE (Sanuki 
no suke nikki). We found 553 NPs containing the wh-word tare, ta ‘who’. Of those, 21 are 
grammatical objects. Of the 21 grammatical objects, 18 are accusative marked. Again, if 
the OJ system of DOM were to persist in EMJ, we would expect all 18 accusative marked 
examples to be specific. However, of these 18, 7 have question focus, and thus would have 
a non-specific interpretation under normal, unmarked circumstances. Upon inspection, 
we find no evidence to indicate that the reference for these is indeed anything but non- 
specific. For example, in the question in (38), there is a background assumption that no
one is supposed to know the things that the addressee speaks about as a matter of course.
Accordingly it is extremely unlikely that there is assumed in the question a definite set 
of people from whom the addressee might learn such things. 


(38) tare ga osiwe o kikite pito no nabete siru
who GEN teaching ACC hearing people GEN lining.up know
beu mo ara-nu koto o-ba iu zo
should ETOP exist-NEG word ACC-TOP say FOC

‘Having heard whose teachings is it that [you] say these things which people
invariably aren't supposed to know?’ (Makura no sōshi, 131, Shinpen Zenshū,
vol. 18, p. 248)


In (39) below, the combination of question particle ka and topic particle wa forms a
rhetorical question: there is no expectation of a concrete answer, so the reference of tare 
is arguably non-specific. 


(39) ima wa katazikenaku mo tare o ka wa yoru-be ni
now TOP regrettably ETOP who ACC FOC TOP depend-place COP
omowi-kikoe-tamawa-n
think-RESP-RESP-will

‘From here on – and [I] am terribly sorry to be saying this, but – whom(ever)
might [you] consider as a benefactor?’ (Genji monogatari, "og, Shinpen
Zenshū, vol. 23, p. 451)


In (40)-(42), the questions focus on previously unintroduced third-person entities. 
There is no obvious source of any basis for D-linking. 


(40) aki.kaze ni patukari ga ne zo kikoyu naru tare
autumn.wind DAT first.goose GEN cry FOC be.audible EXT who
ga tamaadusa wo kakete ki-tu ramu
GEN missive ACC hanging come-PFV EXT

‘The voices of the first geese can be heard on the autumn wind. Whose missives
do [they] come bearing?’ (Kokin wakashū, Shinpen Zenshū, vol. 11, p. 101)


(41) momidiba no tirite tumor-eru wa ga yado ni tare wo
red.leaves GEN scatter pile.up-STAT I GEN dwelling DAT who ACC
matu.musi kokora naku ramu
await.insect around.here cry EXT

‘In my dwelling on which autumn leaves, falling, have piled up – whom must
the matsumushi be awaiting? – the matsumushi cries around here’ (Kokin
wakashū, Shinpen Zenshū, vol. 11, p. 100)


(42) puna.ko-domo no araarasiki kowe nite uraganasiku mo tooku
boat.man-PL GEN rough voice COP mournfully ETOP from.afar
kana to ki-ni-keru utau o kiku mama ni puta-ri
SFP COMP come-PFV-MPST singing ACC listen thus COP two-people
sasi-mukawite naki-keri puna.bito mo tare o kou to
direct-face cry-MPST boatman ETOP who ACC yearn.for COMP
ka oo-sima no ura kanasi-geni kowe no kikoyuru
FOC O-island GEN bay sad-appearing voice GEN be.audible

‘Even as [they] heard the boatmen in their rough voices singing, “Heartlorn,
[we]’ve come so far!” the two faced each other and cried. So whom do the
boatmen long for? Voices from O Island Bay sound so heartsick’ (Genji
monogatari, ‘Tamakazura’, Shinpen Zenshū, vol. 22, p. 90)


Finally in (43)-(44), there is no mention in the previous context of a definite superset 
of suitors out of which one specific suitor might be picked. It may be argued that the 
social context might delimit a definite set of candidates, so the claim of non-specificity 
for these two examples is not as strong as that for the previous five. 


(43) medetaki ya tare wo ka tori-tamau to notamaweba
fortunate SFP who ACC FOC take-RESP COMP say.when
sa.daisyau.dono no sakon.no.seusyau to ka
Left.Major.Captain GEN Minor.Captain COMP FOC

‘As [he] said, “That’s fortunate. Whom is [she] receiving (as a groom)?” [she]
replied, “(I am given to understand) that it is the son of the Major Captain of the
Left, the Minor Captain” ...’ (Ochikubo monogatari, Shinpen Zenshū, vol. 17, p. 89)


(44) omuko no seusyau tare wo tori-tamau zo to
groom GEN Minor.Captain who ACC take-RESP FOC COMP
towi-kereba sa.daisyau no sakon.no.seusyau.dono to
say-MPST.when Left.Major.Captain GEN Left.Minor.Captain COMP

‘As the husband, Minor Captain Kurauto, asked, “Whom will [she] take (as a
groom)?” [she] replied, “(Mother says) [it] is the son of the Major Captain of the
Left, the Minor Captain of the Left; ...’ (Ochikubo monogatari, Shinpen Zenshū,
vol. 17, p. 147)


Again, non-specific expressions of this form are unattested in OJ and violate the con- 
dition on the OJ system of DOM, providing further evidence that the OJ system of DOM 
is no longer operative in EMJ. 


3.2.2 WH-word nani ‘what’ 


A search of the sub-corpus for object NPs containing wh-word na, nani ‘what’ yielded 
results from texts produced between 900 CE (Taketori monogatari) and 1110 CE (Sanuki 
no suke nikki). We found 825 NPs containing the wh-word na, nani ‘what’. Of those, 113 
are grammatical objects. Of the 113 grammatical objects, 39 are accusative marked. Of the 
39 wo-marked grammatical objects, 13 have question focus and are arguably non-specific 
in reference. For example, in (45) below the speaker is expressing dismay at not being 
summoned in time for a funeral. The underlying assumption in the question is that there
could only have been some unknown sort of prohibition preventing the addressee from
sending an invitation. There is no mention of prohibitions in the previous context, nor
does the speaker actually wait for an answer to the question, suggesting the absence of
any presupposed superset related to nani no monoimi o ‘what manner of prohibition?’.
Similarly, in the remaining examples, open-ended questions are asked: ‘what in heaven’s
name?’; ‘whatever?’


(45) ana kokoro u ya rei-sama ni mi-pirake-tamai-tu ran
Ah heart despondent SFP usual-way COP see-open-RESP-PFV EXT
o ima pito-tabi mi-maira-se-zu nari-nuru kokoro usa
ACC yet one-time see-HUM-CAUS-NEG become-PFV heart despondency
o nani no monoimi o site yobi-tamawa-zari-turu zo
ACC what COP prohibition ACC doing call-RESP-NEG-PFV FOC

‘... “Oh, how sad! In the face of the sadness of the fact that [we] will never again
be able to see his honourable face with his eyes open, observing what prohibitions
was it that [you] didn't call [me]?” ...’ (Sanuki no suke nikki, Shinpen
Zenshū, vol. 26, pp. 420–421)


(46) moto no sina toki yo no oboe uti-awi yamu-goto naki
original GEN class time age GEN lesson PFX-meet stop-fact lacking
atari no uti-uti no motenasi kewawi okure-tara-mu wa
spot GEN inside-inside GEN demeanour bearing be.late-STAT-will TOP
sarani mo iwa-zu nani o site iki-owi-ide-kyemu
newly ETOP say-NEG what ACC doing live-grow-come.out-must.have
to iu kawi naku oboyu besi
COMP say point lacking be.thought.of ought

‘... “There is nothing more to be said about those who, while coming from a
venerable home where the original class and the repute of the world at large are
in accord, nonetheless are lacking in the demeanor and bearing appropriate
thereto. Doing what must it have been that [they] were raised, (I wonder)?
[They] should be thought of as not worth mention” ...’ (Genji monogatari,
‘Hahakigi’, Shinpen Zenshū, vol. 20, p. 60)


(47) nani wo site mi no itadura ni oi-nu ramu tosi
what ACC doing body GEN in.vain COP grow.old-PFV EXT year
no omopa-mu koto zo yasasiki
GEN think-will content FOC embarrassing

‘Doing what must it be that [my] body has grown old in vain? How shameful [to
me], what the years must be thinking!’ (Kokin wakashū, Shinpen Zenshū, vol. 11,
p. 404)


(48) tati-wi no kewawi tawe-gata-geni okonau ito
stand-sit GEN bearing withstand-hard-appearing undertake very
aware ni asa no kiri ni koto nara-nu yo o
pitiful COP morning GEN mist DAT otherwise be-NEG world ACC
nani o musaboru mi no inori ni ka to kiki-tamau
what ACC gobble.up body GEN prayer COP FOC COMP listen-RESP

‘Standing up and sitting down in a manner that appeared unbearable, (the old
man) carried out the rites in a way that was so truly pitiful, [he] listened (to the
old man), thinking, “Given that this world is no different than morning mist,
these are the prayers of an earthly body hoarding up what, (I wonder)?” ...’
(Genji monogatari, ‘Yūgao’, Shinpen Zenshū, vol. 20, p. 158)


(49) kono miko ni mawosi-tamapi-si pourai no tama no eda
this aristocrat DAT say-RESP-SPST Horai GEN jewel GEN branch
wo pito-tu no tokoro ayamata-zu motite opasimas-eri nani
ACC one-thing GEN place differ-NEG having come-STAT what
wo motite to.kaku mawosu beki
ACC having that.this say ought

‘[He] has brought the branch with the jewels of Horai that [you] spoke to this
lord about, with not a point of difference [in it]. Having what (as grounds) am [I]
supposed to tell [him] this and that (as excuses)?’ (Taketori monogatari, Shinpen
Zenshū, vol. 12, p. 29)


(50) saru koto ni wa nani no irawe o ka se-mu nakanaka
such thing DAT TOP what GEN reply ACC FOC do-will awkward
nara-mu
be-will

‘With respect to such a thing, what reply am [I] to make? [It] will be awkward.’
(Makura no sōshi, 131, Shinpen Zenshū, vol. 18, p. 248)


(51) kakerite mo nani wo ka tama no kite mo mi-mu
flying ETOP what ACC FOC soul GEN coming ETOP see-will
kara pa ponopo to nari-ni-si monowo
shell TOP ember as become-PFV-SPST given.that

‘Even flying, what would [my] soul, coming here, see? Given that [her] remains
are already turned to embers.’ (Kokin wakashū, Shinpen Zenshū, vol. 11, p. 418)


(52) ausaka no seki pa yoru koso mori-masare kurureba
Osaka GEN checkpoint TOP night FOC guard-excel grow.dark.when
nani wo ware tanomu ramu
what ACC I rely EXT

‘It is at night that [they] guard the Osaka checkpoint more strongly. When the
day ends, what shall I rely on?’ (Heichū monogatari, Shinpen Zenshū, vol. 12,
p. 459)


(53) miya.no.omawe ni uti.no.otodo no tatematuri-tamaw-eri-keru
empress DAT Minister.of.the.Centre GEN give-RESP-STAT-MPST
o kore ni nani o kaka-masi uwe.no.omawe ni wa
ACC this DAT what ACC write-SBJV emperor DAT TOP
siki to iu pumi o namu kaka-se-tamaw-eru
chronicle COMP say text ACC FOC write-CAUS-RESP-STAT

‘On the occasion of the Minister of the Centre giving [them] to the Empress, (she
said), “What shall [I] write on these? On the Emperor’s part, [he] is writing texts
called ‘Chronicles’” ...’ (Makura no sōshi, 327, Shinpen Zenshū, vol. 18, p. 467)


(54) ka bakari kokoro.zasi oroka nara-nu pito.bito ni
this.way RES resolve negligent be-NEG person.person COP
koso a mere kaguya.pime no ipaku nani bakari no
FOC exist EXT Shining.Princess GEN saying what RES GEN
pukaki wo ka mi-mu to ipa-mu isasaka no koto nari
depth ACC FOC see-will COMP say-will trifling GEN thing be

‘... It seems that [they] are people not lacking feeling to this degree.” The Shining
Princess’s reply: “[I] shall tell [you]: What degree of depth do [I] want to see?
[It] is a mere trifling” ...’ (Taketori monogatari, Shinpen Zenshū, vol. 12, p. 23)


(55) aware-gari medurasi-garite kaweru ni nani o ka
impression-exhibit rareness-exhibiting return DAT what ACC FOC
tatematura-mu mamemamesiki mono wa masa nakari namu
give-will practical thing TOP appropriateness lacking SFP

‘... making many signs of delight and interest (in me), when it was time [for me]
to go home (she said), “What shall [I] give to you? Something practical just
won’t do” ...’ (Sarashina nikki, Shinpen Zenshū, vol. 26, p. 298)


(56) ware wa to omowi.agareru tiuzyau.no.kimi zo kanete zo
I TOP COMP presuming Chūjō FOC from.before FOC
miyuru nado koso kagami no kage ni mo
be.visible and.the.like FOC mirror.cake GEN image DAT ETOP
katarawi-paberi-ture watakusi no inori wa nani bakari no
talk-HUM-PFV private COP prayer TOP what RES GEN
koto o ka nado kikoyu
word ACC FOC and.the.like say

‘(The one to speak was) Chūjō, who presumed (to herself that if anyone has
something to wish for, then) surely myself! “[I] was saying to [your] image in
the mirror-cake, ‘(your thousand-year image) appeared from earlier’, and so on.
As for prayers for myself, how much of a boon (could I possibly ask)?” [she]
continued in this vein’ (Genji monogatari, ‘Hatsune’, Shinpen Zenshū, vol. 22,
p. 144)


(57) yoki kakemono wa ari-nu bekeredo karugarusiku wa
good wager TOP exist-PFV ought.however lightly TOP
e-watasu maziki o nani o ka pa nado
can-hand.over impossible given.that what ACC FOC TOP and.the.like
notamawa-suru mi-kesiki ikaga miyu ramu
say-RESP PFX-visage how be.visible EXT

‘However must the sight [of him] saying such things as “Though there ought to
be a good wager, [I] can't be handing anything over too lightly, so what (shall I
wager)?” have appeared (to others)?’ (Genji monogatari, ‘Yadorigi’, Shinpen
Zenshū, vol. 24, p. 378)


Our evidence for the non-specificity of these items is perforce negative in nature: there 
is no positive way to rule out the possibility of a D-linking relationship for any of the 
wh-NP objects in the examples above, and the strength of the grounds for our judgments 
of reference status varies for some of the examples we present here.⁹ However, most of
our judgments carry a high degree of confidence. Given that non-specific expressions of 
this form are unattested in OJ and violate the condition on DOM (7), the evidence shows 
that the OJ system of DOM is no longer operative in EMJ. 
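
The counts reported in §3.1, §3.2.1 and §3.2.2 can be collected in a short tally. The following is a minimal Python sketch and not part of the original study: the names `FUNNELS` and `share_nonspecific` are illustrative assumptions, while the figures are those given in the text above.

```python
# Counts as reported in sections 3.1, 3.2.1 and 3.2.2 (Heian sub-corpus).
# Tuple order: (expressions found, grammatical objects, wo-marked objects,
#               wo-marked objects judged arguably non-specific)
FUNNELS = {
    "weak FQ hosts": (512, 80, 8, 3),
    "tare 'who'": (553, 21, 18, 7),
    "nani 'what'": (825, 113, 39, 13),
}


def share_nonspecific(found, objects, wo_marked, nonspecific):
    """Return the fraction of wo-marked objects judged non-specific."""
    return nonspecific / wo_marked


for label, counts in FUNNELS.items():
    print(f"{label}: {counts[3]} of {counts[2]} wo-marked objects "
          f"({share_nonspecific(*counts):.0%}) arguably non-specific")

# Under the OJ condition on DOM, each of these counts would have to be zero.
TOTAL_NONSPECIFIC_WO = sum(c[3] for c in FUNNELS.values())
```

The tally only restates the reported figures; the judgments of specificity themselves rest on the close reading of context described above.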


4 Discussion and conclusion 


Like all other attested stages of Japanese, both OJ and EMJ have variable object marking. 
However, the results reported in this paper show clearly that EMJ does not share the OJ 
system of DOM in which a correlation between accusative case marking of objects and 
specificity is observed. As described throughout §2, we examined NPs in OJ which under
normal (unmarked) conditions were predicted to be non-specific in reference, namely 
object NP hosts of FQs and wh-object NPs with Q-focus. The distribution of object NP 
hosts of FQs in OJ (Table 1) gives a good reflection of the more general situation with 
regard to specificity and wo-marking: 


Table 1: Object NP hosts of FQs in OJ.

               wo-marked   zero-marked
specific          10            1
non-specific       0            4


In general, OJ wo-marked objects are specific (e.g. (1)), non-specific objects are bare (e.g.
(2)), and some specific objects are bare (e.g. (3)), but there are no wo-marked objects
which are non-specific. This distribution is summarized in Table 4 further below.


⁹ For example, there are conceivably exclusion relationships available to the object wh-NPs in (51)-(52).

In EMJ, by contrast, the distribution of object NP hosts of FQs (see §3.1) is as
summarized in Table 2.¹⁰ This reflects the general situation in EMJ, where both specific and
non-specific objects may be wo-marked or bare, as shown in §3, where we demonstrated
that EMJ has ample attestation of non-specific wo-marked object NPs.


Table 2: Object NP hosts of FQs in EMJ.

               wo-marked   zero-marked
specific           5
                                72
non-specific       3


In general, the values for specificity and those for wo- or zero-marking on objects 
are seen to cross-classify in EMJ. This distribution is summarized in Table 5 below. That 
pattern is not found in OJ and is in direct contrast to the system seen in OJ, which 
disallows wo-marked non-specific objects. 
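
The contrast between the two stages can be stated as a simple check on the cell counts. The sketch below is an illustration only, written in Python; the dictionary layout and the function name `satisfies_oj_condition` are assumptions introduced here, and the figures are those of Table 1 and of the wo-marked column of Table 2.

```python
# Cell counts for object NP hosts of floating quantifiers.
# OJ figures are from Table 1; EMJ wo-marked figures are from Table 2.
# Zero-marked EMJ objects are left out because the source does not
# break them down by specificity.
OJ = {
    ("specific", "wo"): 10,
    ("specific", "zero"): 1,
    ("non-specific", "wo"): 0,
    ("non-specific", "zero"): 4,
}
EMJ_WO = {
    ("specific", "wo"): 5,
    ("non-specific", "wo"): 3,
}


def satisfies_oj_condition(cells):
    """True if no wo-marked object is non-specific (the restriction found in OJ)."""
    return cells.get(("non-specific", "wo"), 0) == 0


print(satisfies_oj_condition(OJ))      # True
print(satisfies_oj_condition(EMJ_WO))  # False
```

A single non-zero count in the non-specific, wo-marked cell is enough to show that the restriction does not hold, which is why the three EMJ examples in §3.1 carry the argument.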

Thus, this paper identifies a major grammatical difference between OJ and EMJ shown 
by the absence of non-specific wo-marked objects in OJ, but their presence in EMJ. We ob- 
serve a change from the OJ system with morphological expression (accusative marking 
on direct objects) of specificity in some contexts, to the EMJ system with no morphologi- 
cal expression (through case marking) of specificity, that is, to a system where specificity 
is determined exclusively by context or NP modification or by the semantics of the head 
noun (e.g. proper noun, relational noun, etc.). This is an important descriptive finding. 

This does not mean, of course, that EMJ does not have some form of rule-governed
DOM, but it does show that the OJ system of DOM, which takes part in expressing
specificity, is not found in EMJ. For EMJ, the variability in case marking must be investigated
throughout the large amount of available data in order to identify a system which gov- 
erns the observable variable case marking of objects. 

Now, in this paper we have not addressed the – important – issue of specific object NPs
in OJ which are not wo-marked (3), and which therefore show that there is no simple one- 
to-one correlation between specificity and wo-marking on objects in OJ. In Frellesvig et 
al. (2015) we discuss this briefly and outline some of the hypotheses which have been or 
may be proposed for absence of accusative case marking on some specific objects, includ- 
ing conditions which may be formulated in terms of clause types (e.g. main (disfavoring 
wo-marking), embedded, relative, nominalized (favoring wo-marking)), or other factors 
which may play a role, such as phonological form, or lexical idiosyncrasy (of both verbs 
and nouns). While a number of tendencies and individual factors may be identified, it 
remains clear that no strong condition or set of conditions for the absence of accusative 
case marking on some specific objects in OJ has been established yet. 


¹⁰ Note that Table 2 does not break down the bare objects into specific and non-specific. As the point of
interest for the comparison with OJ was the reference of wo-marked objects, we did not classify and quantify
the reference of the bare objects. But as we already demonstrated by examples (33) and (34), the category
of bare objects in EMJ contains both specific and non-specific NPs.

Much work remains to be done on this for OJ, empirically involving careful scrutiny of
the more than 2,000 bare objects in the OJ corpus. An important part of the interpretation
of the data will be to consider whether the distribution observed in OJ, summarized in 
Table 4, represents a stable system with (combinations of) conditions for absence of 
accusative case marking on specific objects, which so far has proven too complex to be 
described; or whether in fact the distributional facts of OJ in Table 4 reflect a system 
in transition, from a stable, simple pre-OJ DOM system with straightforward rules for 
expression of the specificity of direct objects, such as that hypothesized in Table 3, to the 
system of variable object marking we see in EMJ, summarized in Table 5, which takes 
no part in the expression of specificity. 


Table 3: Possible system of case marking and specificity of objects in pre-OJ.

               wo-marked   zero-marked
specific           +            -
non-specific       -            +


Table 4: Accusative case marking and specificity of objects in OJ.

               wo-marked   zero-marked
specific           +            +
non-specific       -            +


Table 5: Accusative case marking and specificity of objects in EMJ.

               wo-marked   zero-marked
specific           +            +
non-specific       +            +


This would mean that OJ represents a stage in the actualization of the change from 
a system like that in Table 3 (pre-OJ) to that in Table 5 (EMJ) and that in itself would 
provide a ready explanation for the fact that we observe variability in case marking 
of specific objects in OJ. Much further research will be needed to determine whether 
that is the case, and if so, what governed the progression of the actualization of this 
change. A clearer understanding of the factors bearing on variable object marking in 
post-OJ stages of Japanese would be of enormous help, but this too needs much fur- 
ther research. Determination and interpretation of markedness values in a wide range 
of contexts will undoubtedly play an important role in investigating these questions (cf. 
Andersen 2001a,b). 


Abbreviations 
ACC accusative OPT optative 
COM comitative PASS passive 
COMP complementizer PFV perfective 
COP copula PFX prefix 
DAT dative PL plural 
ETOP emphatic topic RES restrictive particle 
EXT extension RESP respect 
FOC focus particle SBJV subjunctive 
GEN genitive SFP sentence final particle 
GER gerund sPST simple past 
HUM humble STAT stative 
IMP imperative SUBJ subject 
MPST modal past TOP topic 


NEG negative 


The following abbreviations indicate sources: 
KK Kojiki Kayō; MYS Man'yōshū; NSK Nihon Shoki. 
Chapter 8 


Nominal and verbal parameters in the 
diachrony of differential object marking 
in Spanish 


Marco García García 


University of Cologne 


This paper deals with the influence that nominal and verbal parameters have on DOM in the 
diachrony of Spanish. Comparing selected corpus studies, I will focus first on the different 
nominal parameters that build up the animacy and referentiality scales, in particular on an- 
imacy and definiteness. In order to clarify how far DOM has diachronically evolved, special 
attention will be paid to inanimate objects, which can be viewed as the alleged endpoint in 
the development of DOM in Spanish. Secondly, I will provide a systematic overview of the 
relevant verbal parameters, which include aspect, affectedness and agentivity. The study 
will show a complex interaction of nominal and verbal parameters, revealing some unex- 
pected correlations: Obligatory object marking is not only found with human and strongly 
affected objects involved in a telic event, but also with inanimate, non-affected and agen- 
tive objects embedded in a stative event. In other words, in Spanish DOM patterns with 
both extremely high and extremely low transitivity. These findings sharply contrast with 
traditional accounts concerning the development as well as the explanation of DOM. 


1 Introduction 


Differential object marking (DOM) is a well-attested phenomenon within Romance lan- 
guages (for an overview see Bossong 1998: 218-230). While in some Romance languages, 
such as Catalan or Modern Portuguese, DOM is confined to a reduced number of con- 
texts, in others, such as Sardinian or Spanish, it is found in many more contexts. This 
paper will focus on Spanish, where DOM seems to have reached a greater stage of de- 
velopment than in any other Romance language. 

As in most of the other Romance languages, DOM in Spanish is signaled by a, which 
goes back to the Latin preposition ad ‘to’. From its beginnings as a preposition with an 
exclusively locative-directional meaning, this preposition was first grammaticalized 
into a marker for indirect objects, i.e. datives. However, even in early Hispano-Romance, 
the a-marker was already regularly used not only with indirect objects, but also with 
certain direct objects, in particular with those showing typical dative properties such as 
strong personal pronouns referring to humans (cf. Pensado 1995: 184-185 and Company 
Company 2002b: 205). Since then, DOM is reported to have evolved gradually along both 
the definiteness and the animacy scales (cf. e.g. Aissen 2003: 470-471). DOM in Spanish 
is said to depend not only on nominal parameters such as animacy and definiteness, but 
also on certain verbal parameters such as telicity and affectedness (cf. Torrego Salcedo 
1999: 1784-1791 among others). Despite the vast literature, which mainly focuses on nom- 
inal parameters, there are still several core questions that remain open. To begin with, it 
is not clear which of the verbal parameters are the most important for the (diachronic) 
distribution of DOM in Spanish. Moreover, it is not obvious how verbal parameters such 
as telicity interact with nominal parameters such as animacy. Lastly, there are quite 
different views about how far DOM in Spanish has evolved. 

The main purpose of this paper is to give an overview of the current state of research 
dealing with these questions. Firstly, I will critically review and compare several corpus 
studies in order to clarify how far DOM in Spanish has actually developed. To this end, 
I will concentrate on nominal parameters such as animacy and definiteness. Particular 
attention will be paid to inanimate objects. These can be seen as the alleged endpoint 
in the evolution of DOM in Spanish. According to Company Company (2002a: 147), at 
least Mexican Spanish is strongly heading towards a complete generalization of object 
marking not only for animates, but also for inanimate objects. This raises the question of 
whether Spanish is typologically shifting from a language with DOM to a language without 
DOM, i.e. to a grammatical system with a sort of regular accusative case marking. 
Secondly, I will provide a systematic overview of the less well-studied verbal parame- 
ters associated with DOM, which include aspect (telicity and perfectivity), affectedness 
and agentivity. As far as agentivity is concerned, I will build on my previous analyses 
for Modern Spanish (García García 2014) and extend them to a diachronic perspective 
providing a test corpus study for the reversible verbs seguir ‘to follow’ and preceder ‘to 
precede’ (cf. §4.3.2). 

The paper is organized as follows: §2 introduces the main conditions determining 
DOM in Modern Spanish as well as its description by means of the animacy scale and 
the definiteness scale. §3 explores the diachrony of DOM in Spanish along these scales 
on the basis of Laca (2006) and a number of other corpus studies. §4 focuses on the 
aforementioned verbal parameters (aspect, affectedness, agentivity) and elaborates on 
their complex interaction with nominal parameters. §5 summarizes and discusses the 
main findings. 


2 Prominence scales and diachronic DOM 


DOM in Spanish is reported to depend first of all on what Laca (2002; 2006) calls lo- 
cal factors, i.e. animacy, definiteness and referentiality. Besides this, the distribution of 
a-marking also seems to be influenced by what Laca (2006: 429-432; 454-462) labels 
global factors, i.e. different kinds of contextual conditions, such as topicality and certain 
verbal parameters.¹ However, these are usually seen as additional, i.e. less important 
conditions, at least from a synchronic perspective. As for Standard Modern Spanish, it 
is generally assumed that DOM is confined to human or at least animate (non-human) 
referents (cf. e.g. Torrego Salcedo 1999: 1782). For definite human objects, a-marking is 
more or less obligatory (cf. (1a)), while for indefinite human objects there is more varia- 
tion. Generally, a-marking is required for indefinite human objects that are specific (cf. 
(1b)). Note, though, that a-marked direct objects need not be specific. This is shown in 
(1c), where the subjunctive mood of the verb in the relative clause signals that the direct 
object, i.e. una actriz ‘an actress’, is non-specific, regardless of whether it is marked by 
a or not (cf. Leonetti 2004: 82-86 for discussion). 


(1) a. Pepe ve        *ø/a  la   actriz. 
       Pepe see[3SG]  ø/to  the  actress 
       ‘Pepe sees the actress.’ 

    b. Pepe busc-a        *ø/a  una  actriz   que  habl-a     arameo. 
       Pepe look.for-3SG  ø/to  an   actress  who  speak-3SG  Aramaic 
       ‘Pepe is looking for an actress who speaks Aramaic.’ 

    c. Pepe busc-a        ø/a   una  actriz   que  hable           arameo. 
       Pepe look.for-3SG  ø/to  an   actress  who  speak-3SG.SBJV  Aramaic 
       ‘Pepe is looking for an actress who speaks Aramaic.’ (non-specific reading) 


As for animate non-human objects, a-marking is optional, even if the object is definite 
as in (2). With inanimate (definite) objects, a-marking is generally ungrammatical (cf. 
(3)). 


(2) Pepe ve        ø/a   la   vaca. 
    Pepe see[3SG]  ø/to  the  cow 
    ‘Pepe sees the cow.’ 


(3) Pepe ve        ø/*a  la   película. 
    Pepe see[3SG]  ø/to  the  film 
    ‘Pepe sees the film.’ 


Fitting these overall generalizations, DOM in Spanish is usually described by means 
of the animacy scale (4), the definiteness scale (5) or a combination of these prominence 
scales (cf. Aissen 2003: 417-418, Laca 2006: 436). 


¹Note that Laca (2006) does not use the terms local and global in the typological sense of Silverstein (1976), 
also followed by Witzlack-Makarevich & Serzant (2018 [this volume]). Thus, her notions are not associated 
with the distinction between languages where differential object marking is local in the sense that it only 
depends on the semantic properties of the object (e.g. animacy), and languages where the marking is rather 
global, i.e. where it also depends on the properties of another co-argument such as the animacy of the 
subject. The question of whether DOM in Spanish is local or rather global in the sense of Silverstein (1976) 
is not explicitly addressed in this paper. See, however, §4.3 focussing on the relative agentivity of the 
subject with respect to the object, as well as García García (2014: 40-43, 76-81), which deals with the 
relative animacy of subject and object. 
(4) Animacy scale: 
human > animate > inanimate 


(5) Definiteness scale: 
personal pronoun (pron.) > proper name (PN) > definite NP (def. NP) > indefinite 
specific NP (spec. NP) > non-specific NP (non-spec. NP) 


As is well known, these scales provide a rough means to capture not only 
language-specific generalizations, but also cross-linguistic tendencies about DOM and 
related phenomena (for a critical discussion see Bickel et al. 2015; Haspelmath 2014; 
Sinnemäki 2014, and Witzlack-Makarevich & Serzant 2018 [this volume]). Typically, the 
scales are conceived of as implicational hierarchies. Among other things, they predict 
that if object marking is required for definite NPs in a given language, it will also be 
used for all higher-ranking categories of the definiteness scale, i.e. proper names and 
personal pronouns. Conversely, if object marking is ungrammatical for definite NPs, it is 
also ruled out for all the lower-ranking categories, i.e. indefinite specific and 
indefinite non-specific NPs. 
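
The implicational reading of the definiteness scale in (5) can be made concrete with a small sketch. The Python code below is only an illustration: the function names and the boolean encoding of "marking is obligatory" are assumptions introduced here, not part of any of the cited analyses. It derives the marking pattern predicted by a given cut-off category and checks whether an arbitrary pattern is compatible with the hierarchy.

```python
# Illustrative sketch only. Category labels follow the definiteness scale in (5);
# the function names and the True/False encoding are hypothetical.
DEFINITENESS_SCALE = ["pron.", "PN", "def. NP", "spec. NP", "non-spec. NP"]


def predicted_marking(cutoff_category):
    """Return the pattern predicted when `cutoff_category` is the lowest
    category on the scale that still requires object marking."""
    cutoff = DEFINITENESS_SCALE.index(cutoff_category)
    return {cat: i <= cutoff for i, cat in enumerate(DEFINITENESS_SCALE)}


def respects_hierarchy(marking):
    """True if no unmarked category outranks a marked one on the scale."""
    values = [marking[cat] for cat in DEFINITENESS_SCALE]
    # Once marking stops, it must not resume further down the scale.
    return all(earlier or not later for earlier, later in zip(values, values[1:]))
```

Under these assumptions, `predicted_marking("def. NP")` marks personal pronouns, proper names and definite NPs but neither type of indefinite NP, and a pattern that marks definite NPs while leaving proper names unmarked is rejected by `respects_hierarchy`.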

Languages with DOM differ in at least two respects. Firstly, object marking may be 
sensitive to either one of the mentioned scales or to both of them (cf. Bossong 1998: 202 
among many others, for a different view see Sinnemäki 2014). For example, in Hebrew 
or Turkish, DOM seems to depend only on the definiteness scale whereas in Spanish or 
Romanian DOM hinges on both the definiteness and the animacy scale. Secondly, languages 
contrast with respect to the transition point, i.e. the right-most category within 
the relevant scale(s) that requires obligatory object marking. In Hebrew, for instance, 
object marking is obligatory for all definite NPs but not for indefinite NPs. By 
implication, object marking is also compulsory for all the higher-ranking categories in the 
definiteness scale, namely proper names and personal pronouns. DOM in Turkish shows 
a very similar distribution. In contrast to Hebrew, however, Turkish also requires DOM 
for indefinite specific NPs (cf. Aissen 2003: 453-454 and references cited therein). 

Since in Spanish DOM depends on both the animacy and the definiteness scale, the 
interaction of these scales has to be taken into account. A very elegant way to represent 
this interaction has been proposed by von Heusinger & Kaiser (2005: 40), who use a 
cross-classification (cf. Table 1). This representation provides a clear though still simplified 
picture of the conditions under which the a-marking of the direct object in Modern 
Standard Spanish is obligatory (+), optional (±) and ungrammatical (-). 

The animacy and definiteness scales are taken to be relevant not only for the syn- 
chronic distribution of DOM, but also for its diachronic development. The diachronic 
expansion is claimed to proceed from the more prominent categories on the left/top of 
the scales to the less prominent ones to the right/bottom of these scales. The opposite 
holds true for the retraction of DOM in that it is supposed to affect the less prominent 
categories before the more prominent ones. This less well attested case seems to be ev- 
idenced by the diachronic development of DOM in Catalan (cf. Dalrymple & Nikolaeva 
2011: 212) and Portuguese (cf. Delille 1970). 
Table 1: DOM in Standard Spanish (cf. von Heusinger & Kaiser 2005: 40) 


Definiteness →   pron.  >  PN  >  def. NP  >  spec. NP  >  non-spec. NP 
Animacy ↓ 
human            +         +      +           +            ± 
animate          +         +      ±           ±            - 
inanimate        ø         ±      -           -            - 


Thus, at an initial stage, object marking may be restricted to human pronouns. At a 
further stage, it may become regular also for the less prominent categories of one or both 
scales, i.e. animate pronouns, human proper names, animate definite NPs and so forth. As 
is sometimes suggested in the literature, this may ultimately lead to a full grammatical- 
ization of the differential object marker into a regular accusative case marker (cf. Aissen 
2003: 255). In this respect, Villar (1983: 191-196) has argued that Proto-Indo-European 
had a differential object marker which, in the historic Indo-European languages, developed 
into an obligatory object case marker (for discussion see Bossong 1984). As has 
already been noted in the introduction and will be discussed in more detail in the 
next section, a similar development has also been claimed regarding Spanish. 


3 Nominal parameters and diachronic DOM in Spanish 


3.1 Diachronic corpus studies 


The historical development of DOM in Spanish has been analyzed in a number of studies 
focusing on the impact of different factors such as animacy and definiteness (cf. e.g. Com- 
pany Company 2002b, Laca 2002; Aissen 2003), topicality (cf. Melis 1995) or affectedness, 
i.e. the influence of certain verb classes (cf. von Heusinger 2008; von Heusinger & Kaiser 
2011). Recently, not only monotransitive but also ditransitive constructions have been 
systematically taken into account (cf. Ortiz Ciscomani 2005; 2011, von Heusinger 2018 
[this volume]). While most of the empirical studies are confined to human and animate 
objects, some of them deal exclusively with inanimate objects (cf. Company Company 
2002a, Barraza Carbajal 2003; 2008). The most detailed empirical investigation is pro- 
vided by Laca (2006), whose corpus findings will serve as a reference point in the follow- 
ing sections. Laca’s corpus analysis comprises data from the 12th to the 19th century. The 
data are taken from nine texts, i.e. between one and three text samples per century.² It 


²The corpus is composed of samples from the following texts: Poema de mio Cid (12th cent.); El Conde Lucanor 
(14th cent.); La Celestina (15th cent.); Lazarillo de Tormes, Documentos lingüísticos de la Nueva España (16th 
cent.); Don Quijote (17th cent.); La comedia nueva, El sí de las niñas, Documentos lingüísticos de la Nueva 
España (18th cent.); El Periquillo sarniento, Pepita Jiménez (19th cent.). 
goes without saying that, given this rather restricted empirical basis, one has to act with 
caution when interpreting the data. Whenever possible, her data will be complemented 
and compared with the findings from other empirical studies. In order to give a critical 
overview of what is known about the diachronic expansion of DOM in Spanish, I will 
first concentrate on the impact of nominal parameters, i.e. animacy and definiteness. To 
this end, I will focus on human objects (§3.2), animate (non-human) objects (§3.3) and 
inanimate objects (§3.4). In a further step, I will discuss the role of verbal parameters, i.e. 
aspect, affectedness and agentivity (§4.1-§4.4). 


3.2 Human objects 


Following Laca (2006: 436-438), I will use the animacy scale in (4) as well as the 
somewhat simplified definiteness scale given in (6).³ 


(6) personal pronoun > proper name > definite NP > indefinite NP > bare noun 


The latter scale differs slightly from the hierarchy given in (5). Most importantly, it 
does not include the category of specificity but that of bare nouns. Whereas indefinite 
NPs may be specific or non-specific, bare nouns are always non-specific. As a conse- 
quence, (6) will not allow for systematic observations concerning correlations between 
specificity and DOM. 

On the basis of Laca's (2006) corpus results and the simplified definiteness scale in (6), 
Table 2 and Figure 1 show the diachrony of DOM with human objects. It is to be noted 
that neither in the figure nor in the table have personal pronouns been considered since 
with this category object marking was already obligatory in Old Spanish. 


Table 2: Diachrony of DOM with human objects (adapted from Laca 2006: 442-443). 


               XII      XIV      XV       XVI       XVII       XVIII    XIX 
Proper name    96%      100%     100%     95%       100%       86%      89% 
               (25/26)  (8/8)    (35/35)  (42/44)   (65/65)    (24/28)  (24/27) 
Definite NP    36%      55%      58%      70%       86%        83%      96% 
               (13/36)  (36/66)  (38/65)  (85/122)  (117/136)  (44/53)  (73/76) 
Indefinite NP  0%       6%       0%       12%       40%        63%      41% 
               (0/6)    (2/31)   (0/11)   (7/59)    (21/53)    (20/32)  (12/29) 
Bare noun      0%       0%       17%      5%        3%         9%       6% 
               (0/12)   (0/7)    (2/12)   (2/40)    (1/39)     (2/22)   (1/17) 


³In contrast to the more fine-grained distinctions proposed by Laca (2006: 439-443), the scale in (6) 
neither includes the differentiation between NPs with and without lexical heads, nor the distinction between 
definite-like NPs with universal quantifiers (e.g. cada ‘each’) and indefinite-like NPs with existential 
quantifiers (e.g. algo ‘some’). Consequently, these categories have not been taken into account in Table 2 and 
Figure 1. For a discussion of these categories cf. Laca (2006: 437-439) and García García (2014: 82-87). 
[Figure 1: line chart of the percentage of a-marked human objects (y-axis, 0%-100%) by century 
(x-axis, XII-XIX) for proper names, definite NPs, indefinite NPs and bare nouns.] 


Figure 1: Diachrony of DOM with human objects (based on Laca 2006: 442-443) 


Table 2 and Figure 1 allow for a number of observations: Firstly, the expansion of DOM 
is roughly confined to definite and indefinite NPs. With definite NPs, the frequency of 
a-marked objects increases significantly and more or less continuously. Starting with 
36% of a-marked objects with definite NPs in the 12th century, we already find 58% in 
the 15th century, 86% in the 17th century and, finally, 96% in the 19th century. Thus, 
from being an optional marker for definite human objects in Old Spanish, essentially 
restricted to dislocated, i.e. topicalized NPs (cf. (7) vs. (8)), a-marking has become an 
almost obligatory requirement for any kind of definite human object in Modern Spanish, 
including non-topicalized NPs (cf. (9)). 


(7) En braço-s  tened-es  mi-s         fija-s       tan  blanc-a-s   como  el   sol. 
    in arm-PL   hold-2PL  1SG.POSS-PL  daughter-PL  so   white-F-PL  as    the  sun 
    ‘In your arms you hold my daughters as white as the sun.’ (Cid 2333, apud Laca 
    2006: 455) 


(8) a   las  sus       fija-s       en  braço  las   prend-ia 
    to  the  3POSS-PL  daughter-PL  in  arm    them  take-IPFV[3SG] 
    ‘He took his daughters in his arms.’ (Cid 275, apud Laca 2006: 428) 


(9) En brazo-s  ten-éis   a   mi-s         hija-s       tan  blanc-a-s   como  el   sol. 
    in arm-PL   hold-2PL  to  1SG.POSS-PL  daughter-PL  so   white-F-PL  as    the  sun 
    ‘In your arms you hold my daughters as white as the sun.’ 


As illustrated by these examples, one of the driving forces for the spread of DOM 
seems to be topicality. However, since topics are typically human and necessarily refer- 
ential, it is not clear whether topicality is also relevant for the spread of DOM concerning 
other subsets of direct objects, such as those expressed by human indefinite NPs. For a 
discussion on the impact of topicality on (diachronic) DOM in Spanish, see Laca (1995: 
85-89; 2006: 455-456); Melis (1995: 134, 161); Pensado (1995: 196-225); Delbecque (2002: 
85); Leonetti (2004: 86-107); von Heusinger & Kaiser (2005: 41-45), and Iemmolo (n.d.: 
ch. 8.5.2). 

As already mentioned above, Table 2 and Figure 1 also show a remarkable evolution 
with respect to human objects expressed by indefinite NPs. Contrary to definite NPs, 
however, we do not observe a continuous but rather a discontinuous development with 
indefinite NPs. From the 12th to the 16th century, a-marking of indefinites is attested 
very scarcely, showing no relevant tokens in the 12th and 15th century and merely 6% 
and 12% of a-marked NPs in the 14th and 16th century, respectively. In the 17th century, 
there is an abrupt rise of a-marked NPs up to 40% followed by a peak of 63% case-marked 
indefinites in the 18th century. Interestingly, case marking in this century is clearly more 
frequent than in the 19th century, where it is attested in merely 41% of the transitive 
constructions, i.e. just as often as 200 years before. As noted by Laca (2006: 460), the 
relatively high percentage of a-marking in the 18th century seems to be due to a verbal 
factor, namely to the disproportionately high number of causative constructions that are 
attested in the corresponding text samples. I will comment on this observation in §4.4. 

Comparing the development of human definite and indefinite objects, Table 2 and 
Figure 1 allow for a second general observation: During the whole period, the frequency 
of marked definite objects is clearly and constantly higher than that of indefinite ob- 
jects. This distribution is completely in line with the expected development based on the 
prominence scales. 
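
The second observation can be checked directly against the token counts in Table 2. The following Python sketch is only an illustration: the variable and function names are introduced here, and the counts are copied from the table. It recomputes the rounded percentages and verifies that the rate of a-marking is higher for definite than for indefinite human objects in every century sampled.

```python
from fractions import Fraction

# (a-marked, total) human object tokens per century, copied from Table 2.
# Order of centuries: XII, XIV, XV, XVI, XVII, XVIII, XIX.
DEFINITE = [(13, 36), (36, 66), (38, 65), (85, 122), (117, 136), (44, 53), (73, 76)]
INDEFINITE = [(0, 6), (2, 31), (0, 11), (7, 59), (21, 53), (20, 32), (12, 29)]


def percent(marked, total):
    """Percentage rounded half up, computed with integer arithmetic only."""
    return (200 * marked + total) // (2 * total)


def definite_always_higher(definite, indefinite):
    """True if the definite rate exceeds the indefinite rate in every century."""
    return all(
        Fraction(dm, dt) > Fraction(im, it)
        for (dm, dt), (im, it) in zip(definite, indefinite)
    )
```

Integer arithmetic is used for rounding because Python's built-in `round` rounds exact halves to the nearest even number, which would turn 20/32 (62.5%) into 62% rather than the 63% given in Table 2.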

A further observation that follows from Table 2 and Figure 1 is that with both proper 
names and bare nouns, there is no attested evolution: similarly to strong personal pro- 
nouns, proper names already required object marking in the 12th century (cf. (10) as well 
as the findings from Company Company 2002b: 207 given in Table 5). Although Figure 1 
shows a slight retraction in the 18th and 19th century, a-marking of proper names is still 
the strongly preferred option today. 


(10) Mat-astes     a   Bucar  e    arranc-amos   el   canpo. 
     kill-2SG.PST  to  Bucar  and  take-1PL.PST  the  field 
     ‘You killed Bucar and we have won the battle.’ (Cid 2458, apud Laca 2006: 447) 


With bare nouns, object marking is hardly ever attested across the centuries. Note 
that the absolute numbers are extremely low with respect to this category showing only 
two or fewer tokens with a-marked objects per century. This is also the case for the 15th 
century, where the relatively high frequency of 17% of DOM corresponds to only 2 out of 
12 relevant instances. Even in Modern Spanish, DOM of bare nouns is generally blocked. 
It is only found under certain conditions: (i) with bare plural objects governed by some 
verbs such as golpear ‘to beat’ (cf. example (16) in §4.2); (ii) with bare plural objects that 
are modified by an attribute as in (11); and (iii) with bare plurals expressing a contrastive 
focus as in (12). 


(11) a. *Detuv-ieron     a   hincha-s. 
        arrest.PST-3PL   to  supporter-PL 
        ‘They arrested some supporters.’ 

     b. Detuv-ieron      a   hincha-s      peligros-o-s    del     Atlético. 
        arrest.PST-3PL   to  supporter-PL  dangerous-M-PL  of.the  Atlético 
        ‘They arrested some dangerous Atlético supporters.’ (Leonetti 2004: 87) 


(12) a. *En el   poblado  vi           a   pescador-es. 
        in  the  village  see.PST.1SG  to  fisher-PL 
        ‘In the village I saw some fishers.’ 

     b. En  el   poblado  vi           a   PESCADOR-ES,  no   a   turista-s. 
        in  the  village  see.PST.1SG  to  fisher-PL     NEG  to  tourist-PL 
        ‘I saw fishers in the village, not tourists.’ (Leonetti 2004: 88) 


By way of summary, it is important to stress a fact that has not received the necessary 
attention in the literature: The expansion of DOM within the domain of humans only ap- 
plies to definite and indefinite object NPs. For the other NP types, there is no observable 
evolution. DOM was either already required in Old Spanish, as is the case with proper 
names, or it was and still is blocked today, as is evidenced by bare nouns. 


3.3 Animate non-human objects 


Let us turn to animate objects that do not refer to human individuals such as animals. 
Table 3 summarizes the corresponding corpus results from Laca (2006). Due to the many 
gaps and the very low numbers of relevant tokens across all categories, no clear picture 
emerges from these findings. 

With regard to proper names, indefinite NPs and bare nouns, no conclusions what- 
soever can be drawn on the basis of these numbers. The results are slightly better for 
definite NPs. Here, one may assume a certain increase of DOM: Whereas in the 14th cen- 
tury only 10% of the definite NPs occur with a-marking, we find 41% of marked objects 
in the 17th century and 36% in the 19th century. Note, however, that there are no cases 
of DOM in the 16th century and that there is a remarkable retraction in the 18th century, 
where, in contrast to the preceding centuries, only 6% of a-marked objects are attested. 

The results from another diachronic corpus analysis, namely that by Company Company 
(2002a,b), suggest a much clearer picture. However, the overall frequency of 
a-marked animate objects is considerably lower, showing 3% of a-marked animate objects 
Table 3: Diachrony of DOM with animate non-human objects (adapted from 
Laca 2006: 442-443) 


               XII     XIV     XV      XVI     XVII     XVIII   XIX 
Proper name    100%    -       -       -       100%     -       - 
               (1/1)   (0/0)   (0/0)   (0/0)   (10/10)  (0/0)   (0/0) 
Definite NP    0%      10%     20%     0%      41%      6%      36% 
               (0/2)   (2/20)  (1/5)   (0/10)  (16/39)  (1/18)  (4/11) 
Indefinite NP  -       0%      -       0%      7%       4%      0% 
               (0/0)   (0/10)  (0/0)   (0/4)   (1/15)   (1/27)  (0/5) 
Bare noun      -       0%      -       0%      0%       0%      0% 
               (0/0)   (0/5)   (0/0)   (0/11)  (0/5)    (0/6)   (0/5) 


in the 13th and 14th century, 6% in the 15th century, and 7% in the 16th century (cf. Table 5 
in §3.4, below). These percentages may indicate a slight and constant increase of DOM, 
but one has to be cautious. Firstly, because the category of animates has not been dif- 
ferentiated with respect to definiteness in the aforementioned corpus study. This means 
that the frequencies within the same study may not be comparable. While the attested 
cases of DOM in the 13th century may contain animate indefinite NPs, the correspond- 
ing data of the 16th century may be confined to animate definite NPs or proper names. 
Secondly, Company Company’s (2002b) study does not provide information about the 
distribution of DOM with animates beyond the 16th century. Thus, in contrast to the 
development of a-marking with human objects, the diachrony of DOM with animates is 
far from clear. 

On the basis of the corpus studies carried out so far, we cannot assess whether there 
really has been an evolution of DOM with animate non-human objects. We clearly need 
further analyses grounded on much broader empirical bases. Moreover, there are some 
additional parameters that must be taken into account with respect to animate non- 
human objects, especially with regard to the category of animals. Beyond definiteness 
and other related semanto-pragmatic criteria such as specificity and topicality, DOM 
with animals additionally seems to depend on the species of the animal denoted by the 
lexical noun as well as on the affective relation between the speaker and the animal ref- 
erent in question (cf. Bossong 1991: 159; Aissen 2003: 457; Real Academia Española 2010: 
2635). Furthermore, a-marking also hinges on the agentivity of the animal referent in the 
given event: Based on data from Don Quijote (17th century), Garcia (1993: 42) observes 
that a-marking of definite animal objects is more likely in contexts where the animals 
are moving and acting on their own than in contexts where no movement of the animals 
is asserted. These parameters may be responsible for a great amount of both synchronic 
and diachronic variation. 

Summing up the results presented so far, it can be concluded that there has been a clear 
evolution of DOM along the definiteness scale. However, the evolution only concerns 
human referents, specifically human objects expressed by full definite and indefinite NPs. 
While in Old Spanish the a-marking was optional (±) for human definite objects and was 
not attested for human indefinite NPs (-), in Modern Spanish we find near obligatory 
a-marking of the former (+) and at least optional a-marking (±) of the latter category (cf. 
Table 4). 


Table 4: Evolution of DOM with human objects along the definiteness scale 


[+human]           Old Spanish       Modern Spanish    evolution 
                   (12th century)    (19th century) 

Personal pronoun   +                 +                 no 
Proper name        +                 +                 no 
Definite NP        ± (36%)           + (96%)           yes 
Indefinite NP      - (0%)            ± (41%)           yes 
Bare noun          -                 -                 no 


3.4 Inanimate objects 


Let us consider the diachrony of DOM with inanimate objects. Interestingly, a-marking 
with inanimate objects is already found in Old Spanish, though it is only attested very 
scarcely (cf. §4.3.2 for some examples). Laca (2006) does not give any numbers concerning 
the development of DOM with inanimate objects. However, her conclusion with 
respect to this lexical subset of object NPs is fairly clear: “On the basis of the analyzed 
corpus, one cannot assume an increase of the frequency of occurrences of object marking 
with inanimates, the use of the object marker is always marginal in these cases” (Laca 
2006: 450, my translation).⁴ 

In contrast, Company Company (2002a,b) comes to a different conclusion. Her corpus 
study considers DOM with humans, animates and inanimates from the 13th to the 20th 
century. The data from the 20th century are exclusively from Mexican Spanish. Based 
on this corpus study, the author observes that a-marking has not only become more fre- 
quent for animate objects, in particular for humans, but also for inanimate objects (cf. 
Table 5). 

As for the 20th century, the data show 17% (64/373) of inanimate objects with 
a-marking. Although Company Company does not differentiate between definite and 
indefinite NPs, it is very likely that the a-marked inanimate objects are mostly definite 
(cf. Barraza Carbajal 2003: 28, 108, García García 2014: 38-39, 81-87). According to Com- 
pany Company (2002a,b), the corpus results clearly indicate that (Mexican) Spanish is 
heading towards a full grammaticalization of the differential object marker into a proper 
accusative case marker: 


⁴“Partiendo del corpus examinado, no puede hablarse de un aumento de las ocurrencias ante inanimados, 
antes bien, la marca en estos casos es siempre marginal” (Laca 2006: 450). 
Table 5: The diachrony of DOM in Spanish according to Company Company 
(2002b: 207) 


            XIII       XIV        XV         XVI         XX 
Pronoun     100%       100%       99%        99%         100% 
            (53/53)    (46/46)    (67/68)    (182/183)   (55/55) 
PN          99%        99%        96%        88%         100% 
            (124/125)  (170/172)  (129/134)  (124/147)   (32/32) 
Human       42%        35%        35%        50%         57% 
            (243/574)  (224/631)  (181/518)  (541/1086)  (81/141) 
Animate     3%         3%         6%         7%          - 
            (4/155)    (2/64)     (2/34)     (11/168)    - 
Inanimate   1%         0%         3%         8%          17% 
            (2/300)    (1/300)    (8/300)    (54/641)    (64/373) 


Nowadays, the last stage of the grammaticalization is going on; an interesting slow 
invasion of the a case-marker into the prototype inanimate zone is taking place, it 
is no more a classifier ‘personal a’, it is becoming a true case-marker, generalizing 
its meaning and syntactic distribution. (Company Company 2002b: 208) 


However, (Mexican) Spanish actually seems to be rather far from entering this last 
stage of grammaticalization. In addition to the above-mentioned findings from Laca 
(2006: 450), this is shown by a number of further empirical analyses (cf. Buyse 1998; 
Barraza Carbajal 2003; Tippets 2011; García García 2014). In what follows, I will briefly 
comment on these studies. 

Barraza Carbajal (2003) is a detailed diachronic corpus analysis confined to inanimate 
objects. The data are based on different text types (literary texts, newspapers, academic 
texts) from the 16th, 18th and 20th centuries. One half of the texts stem from Spain, the 
other half from Mexico. Similar to Company Company (2002a,b), the findings from Bar- 
raza Carbajal also suggest an increase of a-marking with inanimate objects. However, 
the increase is much lower, showing 2% (12/547) of a-marked instances in the 16th cen- 
tury, 3% (15/546) in the 18th century and only 5% (49/962) in the 20th century. 

Similar results for the 20th century are provided by Tippets (2011), a contrastive anal- 
ysis of DOM based on exclusively oral material from Buenos Aires, Madrid and Mexico 
City. At least as far as inanimate objects are concerned, the distribution of a-marking is 
notably higher in Buenos Aires but still comparably low in all three cities: Tippets (2011: 
113) found 8% (26/339) of a-marked instances in Buenos Aires, 5% (18/345) in Madrid, and 
5% (13/283) in Mexico City. Particularly the percentages for Madrid and Mexico resemble 
the above-mentioned results from Barraza Carbajal (2003). Altogether, the distribution 
of a-marking with inanimate objects across the three varieties considered by Tippets 
(2011) is 5.9% (57/967). 
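
The pooled figure can be checked directly from the counts cited above. The following is only a minimal sketch, assuming that the three city samples are simply added together; the variable and function names are illustrative and not taken from Tippets (2011).

```python
# Counts of a-marked inanimate objects out of all inanimate objects,
# as cited in the text for Tippets (2011: 113).
counts = {
    "Buenos Aires": (26, 339),
    "Madrid": (18, 345),
    "Mexico City": (13, 283),
}


def share(marked, total):
    """Return the percentage of a-marked objects."""
    return 100 * marked / total


# Percentage per city, rounded to one decimal place.
per_city = {city: round(share(m, n), 1) for city, (m, n) in counts.items()}

# Pool the raw counts instead of averaging the three percentages,
# so that each city contributes in proportion to its sample size.
pooled_marked = sum(m for m, _ in counts.values())
pooled_total = sum(n for _, n in counts.values())
pooled = round(share(pooled_marked, pooled_total), 1)

print(per_city)
print(pooled_marked, pooled_total, pooled)
```

Pooling the counts rather than averaging the percentages matters little here because the three samples are of similar size, but it is the safer choice when sample sizes differ.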

Buyse's (1998) study is a synchronic corpus analysis that uses mainly written texts 
from 20th century European Spanish. Regarding inanimate objects, his corpus shows 
only 3.2% (65/1,936) of marked instances. According to my own empirical research (García
García 2014: 71), the frequency of a-marked inanimate objects in the 20th century is
even lower, namely 1.2% (573/48,231). My corpus analysis is based on the Base de Datos de 
Verbos, Alternancias de Diátesis y Esquemas Sintáctico-Semánticos del Español (ADESSE), 
an open source data base of 1.5 million words that pertain to written and oral texts stem- 
ming from Spain (80%) and Latin America (20%).? Figure 2 summarizes the results of 
DOM with inanimate objects obtained in the previously mentioned corpus studies. (DO 
refers to morphologically non-marked direct objects and a DO to a-marked direct ob- 
jects). 


[Figure 2: bar chart; vertical axis from 0% to 100%; legend: a DO, DO; bars for Barraza (2003), Buyse (1998), Company (2002a, b), García (2014), Tippets (2011)]


Figure 2: Percentages of DOM with inanimate objects in different corpora (20th 
century) 


As can be observed in this figure, the percentages of inanimate objects with a-marking 
found in the cited studies range from 1.2% to 17.2%. Interestingly, the reasons for the 
differing results seem to be neither connected to the origin of the data (Spain, Mexico 
etc.), nor to the type of the data (oral vs. written), but rather to the notion of animacy. This
category is usually taken for granted and not defined explicitly. Particularly important 
in this regard is the categorization of objects denoting collectives such as equipo ‘team’ 
or empresa ‘company’, which are more likely to occur with a-marking. Crucially, in some 
corpus studies such as in Barraza Carbajal (2003), collectives are classified as inanimates, 
whereas in others, such as my own (García García 2014), they are subsumed under the 
category of animates. This may be one of the causes for the differing results (cf. García 
García 2014: 72-75). In order not to blur the distinction between animates and inanimates, 
the most adequate treatment would be to put collectives in a separate class, or, as Ilja 


For details see http://adesse.uvigo.es/index.php/.




Marco García García 


A. Serzant (p.c.) has suggested, to simply exclude them from the analysis of DOM. This 
would do justice to the problem that the animacy association of these nouns is context- 
dependent and not uniform. 

To summarize this section, it can be concluded that there is no clear support for an 
evolution of DOM with inanimate NPs. Although a-marking of inanimate objects seems 
to be attested already in Old Spanish, it is still very rare today. Thus, there is no evidence 
for the hypothesis that the differential object marker is becoming a non-differential ac- 
cusative case marker. On the contrary, the empirical findings discussed in this section 
suggest that the evolution of a-marking from Old to Modern Spanish is basically re- 
stricted to human definite and human indefinite objects. This may lead to the conclusion 
that the a-marker is basically “a marker of animate direct objects” (de Swart 2007: 132), 
or human direct objects, to be more precise. However, this is a somewhat problematic 
simplification since, in combination with certain verbs, a-marking may also be required 
for inanimate objects (cf. §4.3 below).


4 Verbal parameters and diachronic DOM in Spanish 


In this section, I will consider different verbal parameters, elaborating on their interac- 
tion with nominal parameters and their influence on synchronic and diachronic DOM. 
I will first look at aspect, focusing on telicity (§4.1), then take into account the role of
affectedness (§4.2), and, finally, point to the relevance of agentivity (§4.3-§4.4).


4.1 Aspect 


According to Torrego Salcedo (1999: 1787-1790), aspect has a clear and systematic influence
on DOM in Modern Spanish. She states that direct objects governed by telic verbs,
i.e. by Vendler’s (1957) ACHIEVEMENT and ACCOMPLISHMENT verbs such as insultar ‘to insult’
and curar ‘to treat’, take the a-marker obligatorily, at least if the object referents
are human. This is illustrated in (13).


(13) Insult-aron      *ø/a  un estudiante.
     insult-3PL.PST   ø/to  a  student


     ‘They insulted a student.’


Even though the direct object in (13) is indefinite, a-marking is not optional but cate- 
gorical. Note, however, that the verbs considered by Torrego Salcedo are not only char- 
acterized by being telic, but also by two further non-aspectual properties: firstly, verbs 
such as insultar ‘to insult’, sobornar ‘to bribe’, curar ‘to treat’ and emborrachar ‘to make 
drunk’ involve an affected object (cf. §4.2). Secondly and more importantly, these verbs
only accept object arguments that are human. Thus, the alleged lexicalization of the a- 
marker assumed for these verbs might not be tied to telicity but rather to their strong 
preference for human objects (cf. also von Heusinger 2008: 28-29). Further evidence for 
this view is provided by the fact that direct objects governed by typical telic predicates 
with a strong preference for inanimate objects such as the ACHIEVEMENT verbs abrir ‘to
open’ or cerrar “to close’ are systematically blocked for DOM.$ 


(14) Pepe  abr-e          ø/*a  la   puerta.
     Pepe  open-3SG.PRS   ø/to  the  door


     ‘Pepe opens the door.’


Torrego Salcedo (1999) also considers atelic verbs, i.e. Vendler’s (1957) ACTIVITIES (e.g. 
besar ‘to kiss’) and STATES (e.g. conocer ‘to know’). They seem to differ with respect
to the transition point of DOM, i.e. the right-most category within the relevant scales
requiring object marking. Contrary to the above-mentioned telic predicates, with verbs 
denoting ACTIVITIES and STATES, a-marking of indefinite human objects is not obligatory 
but rather optional. According to Torrego Salcedo (1999: 1788-1789), object marking with 
ACTIVITY verbs may lead to a shift from an atelic to a telic interpretation. However, this is 
controversial. As convincingly argued by Delbecque (2002: 95-97), the telic reading does 
not depend on DOM. This is shown in (15), which clearly denotes a telic event, regardless 
of whether the object is a-marked or not. 


(15) Bes-aron       ø/a   varios   ciclista-s   en  una  hora.
     kiss-3PL.PST   ø/to  several  cyclist-PL   in  one  hour


     ‘They kissed several cyclists in one hour.’


From a diachronic perspective, the influence of aspect on DOM has been studied by 
Barraza Carbajal (2008). This study is confined to inanimate objects. Therefore, it allows 
for an animacy-independent evaluation of the impact of aspect. Besides telicity, her study
also considers perfectivity, i.e. the proper aspectual parameter related to the viewpoint 
of an event (perfective vs. imperfective). As far as telicity is concerned, the results of 
Barraza Carbajal (2008: 343-346) show that a-marking through time does not correlate 
with telic verbs such as comprar ‘to buy’, but rather with atelic verbs such as conocer ‘to
know’ (cf. Table 6).

In each of the considered time periods in Table 6, the percentages of a-marked objects 
are clearly higher with atelic than with telic verbs. This is particularly evident for the 
18th century, where 93% of the a-marked objects are governed by atelic verbs. Note that, 
in all centuries, there is also a clear correlation between atelic verbs and the absence of 
a-marking. For example, in the 15th-16th century we find that not only 75% of the cases 
with DOM are attested with atelic verbs, but also that 61% of the instances without DOM 
combine with atelic predicates. Though in all of the time periods the percentages of atelic
verbs are always higher for objects with a-marking than for those without a-marking, 
it is striking that, in the 20th century, the difference is only minimal (72% vs. 70%). This 
suggests that, diachronically, the influence of atelic verbs has decreased. Nowadays, the 
frequency of atelic verbs with a-marked objects roughly corresponds to the frequency 


Note also that there are some verbs such as preceder ‘to precede’ and suceder ‘to follow’ that require
a-marking even when the object is inanimate. Clearly, these verbs denote atelic rather than telic events
(cf. §4.3).
Table 6: Telicity and diachronic DOM with inanimate objects (Barraza Carbajal
2008: 345)


            DO                         a DO
            atelic       telic         atelic       telic
XV-XVI      61%          39%           75%          25%
            (326/535)    (209/535)     (18/24)      (6/24)
XVIII       76%          24%           93%          7%
            (404/531)    (127/531)     (67/72)      (5/72)
XX          70%          30%           72%          28%
            (639/913)    (274/913)     (133/185)    (52/185)


of these verbs with objects without a-marking. The same applies to telic verbs (28%
vs. 30%). Consequently, telicity itself does not seem to be a relevant factor for DOM in
Modern Spanish, at least as far as inanimate objects are concerned (cf. Barraza Carbajal
2008: 345).
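
The shrinking difference can be made explicit by computing, for each period, how much higher the share of atelic verbs is among a-marked objects than among unmarked objects. This is only an illustrative sketch based on the counts in Table 6; the dictionary layout and names are assumptions made for the example.

```python
# Raw counts from Table 6 (Barraza Carbajal 2008: 345):
# (objects with an atelic verb, all objects) for unmarked objects (DO)
# and for a-marked objects (aDO).
table6 = {
    "XV-XVI": {"DO": (326, 535), "aDO": (18, 24)},
    "XVIII": {"DO": (404, 531), "aDO": (67, 72)},
    "XX": {"DO": (639, 913), "aDO": (133, 185)},
}


def pct(part, total):
    """Return part/total as a percentage."""
    return 100 * part / total


# Gap in percentage points between the atelic share of a-marked objects
# and the atelic share of unmarked objects, per period.
gaps = {}
for period, row in table6.items():
    atelic_unmarked = pct(*row["DO"])
    atelic_marked = pct(*row["aDO"])
    gaps[period] = round(atelic_marked - atelic_unmarked, 1)

print(gaps)
```

On these counts the gap is clearly larger in the two earlier periods than in the 20th century, where it amounts to roughly two percentage points.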

The results for perfectivity, that is, the criterion related to the viewpoint aspect, re- 
semble those for telicity. Barraza Carbajal's (2008: 346-348) data show that there is a 
slight diachronic preference for DOM in imperfective rather than in perfective events. 
For the 20th century, the corpus findings show that 79% (146/185) of the a-marked ob- 
jects co-occur with an imperfective verb form while only 21% (39/185) are attested with a 
perfective verb form. Similar to what is the case with telicity, the percentages for the con- 
structions without DOM are comparable: While 74% (676/913) of the sentences without 
a-marking denote an imperfective event, 26% (237/913) express a perfective event. 

To sum up, our brief discussion of aspect points to the following conclusions: Firstly, 
the alleged lexicalization of the a-marker found with certain telic verbs such as insultar
‘to insult’ may not be due to telicity but rather to the verb's restriction to human
objects. Secondly, Barraza Carbajal's (2008) analysis of inanimate objects suggests that
aspect in itself has only a minor influence on DOM in Spanish. Thirdly, it seems that this 
influence decreases through time. Finally, it is remarkable that (diachronic) DOM does 
not correlate with telic and perfective but with atelic and imperfective events, i.e. with 
verbal parameters indicating a low rather than a high degree of transitivity. This correla- 
tion seems to contradict the findings concerning the second important verbal parameter 
related to DOM, namely affectedness. 


4.2 Affectedness 


The relevance of affectedness for DOM in Spanish has been pointed to by Spitzer (1928), 
Pottier (1968) and Torrego Salcedo (1999), among others. Similarly to telicity, Torrego 
Salcedo (1999: 1791) notes that, in Modern Spanish, objects governed by verbs selecting 
an affected object such as golpear ‘to beat’ require a-marking even for human objects 
that are indefinite and non-specific. As (16) shows, even bare nouns require the a-marker, 
at least with the verb golpear ‘to beat’.
(16) Siempre  golpe-an   *ø/a  turistas.
     always   beat-3PL   ø/to  tourists


     ‘They always beat tourists.’


According to the literature, some of the verbs selecting an affected object such as 
castigar ‘to punish’, sobornar ‘to bribe’ or odiar ‘to hate’ seem to have lexicalized the
object marker for all human objects (cf. Leonetti 2004: 84 among others). However, it is 
not clear whether this alleged lexicalization is really due to affectedness. Again, most of 
these verbs only accept human objects. Verbs that also allow for inanimate objects such 
as odiar ‘to hate’ only require a-marking when the object is human. As stated by von
Heusinger (2008: 9): “It rather seems that it is just the condition of being human that
triggers (obligatory) DOM.” Moreover, the assumption that verbs such as odiar ‘to hate’
select an affected object is not without problems. Usually, such predicates are analyzed as 
psychological verbs having an EXPERIENCER and a STIMULUS as their arguments, whereby 
neither the former nor the latter represents a properly affected participant. 

The diachronic impact of affectedness on DOM in Spanish has been systematically 
analyzed by von Heusinger (2008) and von Heusinger & Kaiser (2011). In the latter study,
affectedness is defined as the “persistent change of an event participant” (von Heusinger 
& Kaiser 2011: 593). Moreover, affectedness is taken as a gradual notion that is specified 
by means of Tsunoda's (1985: 388) transitivity or affectedness scale, where different verb 
classes are ordered with respect to the degree of affectedness of the patient argument (cf. 
Table 7). 


Table 7: Affectedness scale of Tsunoda (1985: 388, first 5 classes) with Spanish
verbs (von Heusinger & Kaiser 2011: 609)


1                                     2                                   3                     4                       5
Direct effect on patient              Perception                          Pursuit               Knowledge               Feeling
(= effective action)
1a                 1b                 2a             2b
+result            -result            +attained      -attained
matar ‘kill’,      golpear ‘hit’,     ver ‘see’,     escuchar ‘listen’,   buscar ‘search for’,  conocer ‘know’,         querer ‘like’,
herir ‘violate’    tirar ‘shoot’      oír ‘hear’     mirar ‘look at’      esperar ‘wait for’    entender ‘understand’   temer ‘fear’


The left-most class, i.e. EFFECTIVE ACTION, comprises prototypical transitive verbs 
such as kill or hit. This class can further be subdivided into two subclasses (1a and 1b), de- 
pending on whether the event denoted by the predicate has a direct result on the patient 
or not. Verbs from the EFFECTIVE ACTION class 1a such as kill are supposed to impose
the highest degree of affectedness on the corresponding patient. The verb classes to the 
right imply a respectively lower degree of affectedness. 
Focusing on the five verb classes given in the affectedness scale in Table 7, von Heusinger
& Kaiser (2011) carried out a diachronic corpus analysis considering 12 verbs, i.e. 2
verbs per class, including the subclasses of the EFFECTIVE ACTION type. Their study com- 
prises 2,000 sentences from the 15th, 17th and 19th centuries extracted from the Corpus 
del Español and CORDE. While they only considered human NPs, they carefully differ- 
entiated between definite and indefinite NPs. They found clear significant correlations 
between verb classes and diachronic DOM with both definite and indefinite objects. Here, 
I will only consider the latter NP subtype, i.e. human indefinite objects, since the impact 
of verb classes on DOM is more obvious with these objects. The results are presented in 
Table 8 and Figure 3. 


Table 8: Percentages of a-marking of human indefinite direct objects for five
verb classes (von Heusinger & Kaiser 2011: 611)


                                                          15th cent.   17th cent.   19th cent.
1a + 1b  EFFECTIVE ACTION: matar, herir, golpear, tirar   18%          40%          79%
                                                          (9/51)       (21/53)      (46/58)
2a + 2b  PERCEPTION: oír, ver, escuchar, mirar            17%          71%          93%
                                                          (1/6)        (22/31)      (27/29)
3        PURSUIT: buscar, esperar                         11%          23%          41%
                                                          (1/9)        (8/35)       (17/41)
4        KNOWLEDGE: conocer, entender                     -            31%          67%
                                                          (0/0)        (5/16)       (14/21)
5        FEELING: querer, temer                           -            52%          75%
                                                          (0/0)        (11/21)      (15/20)


Von Heusinger & Kaiser’s (2011) findings show a great influence of verb classes on 
DOM through time. Furthermore, they suggest at least a partial correlation between 
diachronic DOM and affectedness. For example, there are clearly higher percentages 
of a-marked instances in each of the centuries for direct objects governed by verbs of 
the EFFECTIVE ACTION class (e.g. matar ‘to kill’, golpear ‘to hit’) than for direct objects 
combining with the PURSUIT class (e.g. buscar ‘to search for’, esperar ‘to wait for’).
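
The comparison between verb classes can be reproduced from the counts in Table 8. The sketch below is illustrative only: the data structure and the helper names `rate` and `ranking` are assumptions made for this example, and empty cells (0/0) are simply left out of the ranking.

```python
# a-marked / total human indefinite objects per verb class and century,
# copied from Table 8 (von Heusinger & Kaiser 2011: 611).
table8 = {
    "EFFECTIVE ACTION": {15: (9, 51), 17: (21, 53), 19: (46, 58)},
    "PERCEPTION": {15: (1, 6), 17: (22, 31), 19: (27, 29)},
    "PURSUIT": {15: (1, 9), 17: (8, 35), 19: (17, 41)},
    "KNOWLEDGE": {15: (0, 0), 17: (5, 16), 19: (14, 21)},
    "FEELING": {15: (0, 0), 17: (11, 21), 19: (15, 20)},
}


def rate(marked, total):
    """Percentage of a-marked objects, or None when the cell is empty."""
    return None if total == 0 else round(100 * marked / total)


def ranking(century):
    """Verb classes with data, ordered from highest to lowest a-marking rate."""
    rates = {cls: rate(*cells[century]) for cls, cells in table8.items()}
    attested = {cls: r for cls, r in rates.items() if r is not None}
    return sorted(attested, key=attested.get, reverse=True)


for century in (15, 17, 19):
    print(century, ranking(century))
```

In every century the EFFECTIVE ACTION class ranks above the PURSUIT class, which is the partial correlation with affectedness noted above.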

However, as noted by von Heusinger & Kaiser (2011), the corpus results do not fully 
mirror the expectations based on Tsunoda’s (1985) affectedness scale. There are some 
interesting mismatches concerning the correlation between diachronic DOM and affect- 
edness. The most striking mismatch concerns the class of FEELING, which represents the 
lowest ranking class in the proposed affectedness scale (cf. Table 7). Contrary to expectation,
this class showed a much greater affinity for object marking than the PURSUIT
or the KNOWLEDGE class. Taking a closer look at the FEELING class, von Heusinger & 
Kaiser (2011) found that the two selected verbs, i.e. querer ‘to like’ and temer ‘to fear’, 
behave very differently. While the first shows the expected lower preference for object 
marking, the latter demonstrates an unexpectedly strong preference for a-marking. The
authors explain the unpredicted behavior of DOM with temer ‘to fear’ as follows: 
Figure 3: Percentages of a-marking of human indefinite depending on verb
classes and time (von Heusinger & Kaiser 2011: 611)


[T]he direct object of ‘fear’ has more typical properties of a subject than a proto- 
typical object of ‘like’ (see Kirsner & Thompson 1976). This might be the cause of 
temer's high scores. This behaviour, however, has nothing to do with affectedness, 
but rather with the competition between the agentivity of the participants involved 
in the event. (von Heusinger & Kaiser 2011: 613) 


A similar contrast to the one between querer ‘to like’ and temer ‘to fear’ is found
within the PERCEPTION class. Here, the verbs of auditory perception, i.e. escuchar ‘to listen’
and oír ‘to hear’, show a notably stronger preference for diachronic DOM than the
visual perception verbs mirar ‘to look at’ and ver ‘to see’ (cf. von Heusinger & Kaiser
2011: 614). The different behavior of these verbs can be explained along the same lines as
the contrast between querer ‘to like’ and temer ‘to fear’. While the verbs of auditory perception
presuppose a noise-producing source as their object argument, i.e. a physically
active and thus agent-like participant, the object argument of visual perception verbs 
need not be an agentive participant (cf. also Enghels 2007: 244-273). 

Summing up, on the one hand there seems to be a clear diachronic correlation between 
affectedness and the spread of DOM. On the other hand, however, the unexpectedly strong
preference for diachronic DOM found with the FEELING verb temer ‘to fear’, as well as
with the verbs of auditory PERCEPTION escuchar ‘to listen’ and oír ‘to hear’, suggests a
rather contrary correlation, namely that DOM is not favored by a higher degree of the
object's affectedness but by a higher degree of the object's agentivity. As we will see
in the next section, agentivity is also the key notion for understanding the rare and 
seemingly exceptional cases of DOM with inanimate objects. 
4.3 Agentivity and DOM with inanimate objects


4.3.1 DOM-sensitive verb classes in Modern Spanish


As shown in §3.4, a-marking of inanimate direct objects is generally ungrammatical in
Modern Spanish, cf. (3) repeated in (17) for convenience: 


(17) Pepe  ve         ø/*a  la   película.
     Pepe  see[3SG]   ø/to  the  film


     ‘Pepe sees the film.’


However, in some cases, such as those given in (18), a-marking of inanimate objects 
is obligatory or at least the strongly preferred option. 


(18) a. Un  artículo  preced-e      *ø/a  un  sustantivo.
        a   article   precede-3SG   ø/to  a   noun
        ‘An article precedes a noun.’

     b. En  este  cóctel    el   vodka  pued-e   sustitu-ir      *ø/a  la   ginebra.
        in  this  cocktail  the  vodka  can-3SG  substitute-INF  ø/to  the  gin
        ‘In this cocktail, vodka can be substituted by gin.’

     c. La   euforia   caracteriz-a       *ø/a  la   situación.
        the  euphoria  characterize-3SG   ø/to  the  situation
        ‘Euphoria characterizes the situation.’

     d. La   mujer  venc-ió        *el/al      destino.
        the  woman  beat-3SG.PST   the/to.the  destiny
        ‘The woman beat destiny.’

     e. No   llam-an   conflicto  *ø/a  una  pelea.
        NEG  call-3PL  conflict   ø/to  a    fight
        ‘They do not call a fight a conflict.’


Note that these examples challenge many of the standard assumptions about DOM. 
Firstly, they call into question the implicational predictions associated with prominence 
scales mentioned in §2: The observation based on (18), that (definite and indefinite) inanimate
objects must take the a-marker, would lead to the wrong prediction that a-marking
is also obligatory for animate non-human objects.” Obviously, this is not the case. In 
most contexts, a-marking of animate non-human objects is optional rather than categorical
(cf. Table 1). As noted by Torrego Salcedo (1999: 1788), among others, a-marking in


Though it is more usual to find definite rather than indefinite NPs among inanimate objects with a-marking
(in particular with those that are not modified by an attribute), definiteness is not a necessary condition
for a-marking (cf. (18a) and (18e)).
sentences such as those in (18) is not determined by nominal but by verbal factors, more
specifically by lexical verbs such as preceder ‘to precede’.

This conclusion is certainly true, but it involves a second problem. It contests the tra- 
ditionally assumed hierarchy of DOM conditions in Spanish, according to which object 
marking depends first and foremost on nominal parameters (animacy and definiteness) 
rather than on verbal parameters. 

The very impact of verbal parameters involves yet a third puzzle for the standard as- 
sumptions about DOM (in Spanish). The main verbal factors that are taken to be relevant 
for DOM in Spanish are telicity and affectedness (cf. §4.1 and §4.2). However, in (18) neither
the former nor the latter factors are at play. Apart from (18d), the sentences given in
(18) do not denote a telic, but a stative situation. Furthermore, they involve a non-affected 
rather than an affected object. 

Following Weissenrieder (1985; 1991) and Delbecque (2002), I have argued elsewhere 
(cf. García García 2007: 65-66; 2014: 147-189) that DOM with inanimate objects occurs 
mainly with a small number of verb classes, namely with those given in (19). 


(19) DOM-sensitive verb classes 


a. Verbs of sequencing (e.g. preceder ‘to precede’, suceder ‘to succeed’)
b. Verbs of replacement (e.g. sustituir ‘to substitute’, reemplazar ‘to replace’)
c. Verbs of competition (e.g. vencer ‘to win’, derrotar ‘to defeat’)
d. Verbs of attribution (e.g. caracterizar ‘to characterize’, definir ‘to define’)
e. Verbs of naming (e.g. considerar ‘to consider’, llamar ‘to call’)


The unexpected affinity for DOM with inanimate objects found with these verbs seems 
to be triggered by their specific role semantics, at least as far as the classes (18a-d) are 
concerned.? According to the generalization of thematic distinctness proposed in García 
García (2007: 71; 2014: 145), a-marking of inanimate direct objects is required when the
subject does not outrank the object in terms of agentivity. Before illustrating this generalization,
I will briefly specify my notion of agentivity, which is based on Primus’ (1999a;
2006) Proto-Role model, a refined version of that by Dowty (1991).

Primus (1999a; 1999b; 2006) distinguishes two types of thematic information that de- 
fine Proto-Roles: involvement and dependency. Involvement is characterized by the num- 
ber and content of Proto-properties, which roughly correspond to those mentioned by 
Dowty (1991: 573), that is, control, (autonomous) movement, experience and possession. 
The second type of thematic information, viz. dependency, describes the causal rela- 
tion between the involved co-arguments. According to Primus (1999a: 52; 2006: 56), the 
PROTO-PATIENT always depends on the PROTO-AGENT (co-argument dependency). Cru- 
cially, the co-argument dependency relation is taken as the central criterion that dis- 
tinguishes the PROTO-AGENT from the PROTO-PATIENT. Whereas the PROTO-PATIENT is 
defined by its causal dependency on the PROTO-AGENT, the PROTO-AGENT is conceived


5DOM with verbs of naming is mostly found in double object constructions, in particular when the object 
argument and the predicative nominal are adjacent, as in (18e). Thus, with this verb class DOM is rather 
due to syntactic factors (cf. García García 2014: 102-104). 
of as a causally independent co-argument, i.e. as an argument whose existence and in-
volvement in a given event do not depend on any other argument. 

Following Primus (2006), not just participants accumulating many or all of the Proto- 
Agent involvement properties (control, experience etc.), such as the first argument of 
Uma kills Bill, will count as PROTO-AGENTS. Participants showing a minimal number 
or even none of the corresponding involvement properties, such as the subject in Uma 
is brave, are also considered as PROTO-AGENTS, though as logically weaker ones.’ This 
is due to the fact that, in both situations, Uma functions as a causally independent co- 
argument. 

On the basis of Primus’ notion of agentivity, let me now illustrate the above-mentioned 
generalization of thematic distinctness. I will focus on the verbs of sequencing (19a) and 
the verbs of replacement (19b), which can be subsumed under the more abstract class 
of reversible predicates since they both point to a reversible relation between their
co-arguments. Consider (18a), where the verb preceder ‘to precede’ denotes a merely temporal
ordering of the core arguments artículo ‘article’ and sustantivo ‘noun’. According
to Primus (2006: 56), both arguments can be categorized as PROTO-AGENTS. This follows
from the fact that, in the sequencing event denoted by preceder ‘to precede’, none of the
co-arguments depends on the other. Note that the same (truth-functional) meaning as
in (18a) can be expressed by means of the verb suceder ‘to succeed/come after’, which is
the converse counterpart of preceder ‘to precede’:


(20) Un  sustantivo  suced-e       *ø/a  un  artículo.
     a   noun        succeed-3SG   ø/to  a   article


     ‘A noun comes after an article.’


As predicted by the generalization of thematic distinctness, a-marking is required in 
(18a), as well as in (20). Note that the a-marked NPs in (18a) and (20) are not indirect 
but direct objects. Though from a semantic point of view neither preceder ‘to precede’ 
nor suceder ‘to succeed’ are typically transitive predicates, morphosyntactically they
behave as canonical transitive verbs. This is evidenced by the fact that these verbs fulfill 
the standard morphosyntactic criteria for transitivity in Spanish. They allow for both 
pronominalization of the object by means of an accusative clitic and transformation into 
a passive (cf. García García 2014: 55-56). 

The obligatory object marking in (18b) can also be accounted for by thematic distinct- 
ness. Similar to (18a), (18b) also denotes a reversible relation between the correspond- 
ing co-arguments. Obviously, (18b) does not encode an asymmetric substitution event, 
with vodka and gin functioning as the respective PROTO-AGENT and PROTO-PATIENT ar- 
guments. Rather, vodka and gin are conceived of as replaceable ingredients. This means 
that (18b) neither entails a proper causation on the part of the subject, nor a proper af- 
fection on the part of the object argument. Again, both arguments can be analyzed as 
PROTO-AGENTS since none of the participants depends on the other. To put it differently, 


PROTO-AGENTS having many or all of the corresponding involvement properties are specified as A^max,
whereas PROTO-AGENTS with only a minimal or even none of the relevant involvement properties are
referred to as A^min (cf. Primus 2006: 61).
in the referred situation vodka and gin serve the same role-semantic function: They can
both be used to cause a specific change of state concerning the taste, the alcoholic con- 
tent or some other characteristic property of the cocktail in question (cf. García García 
2007: 80; 2014: 137-138, and Primus 2012: 78). 
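
The logic of the generalization of thematic distinctness can be paraphrased as a simple decision rule. The following toy sketch is not part of the analysis itself: it assumes, purely for illustration, that relative agentivity can be reduced to a single yes/no feature (whether the object is causally dependent on the subject), and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Clause:
    """Toy encoding of a transitive clause with an inanimate object.

    The boolean is a hypothetical simplification: it records only whether
    the object is causally dependent on the subject (a PROTO-PATIENT)
    or causally independent of it (a second PROTO-AGENT).
    """
    example: str
    object_depends_on_subject: bool


def subject_outranks_object(clause: Clause) -> bool:
    # Assumption: the subject outranks the object in agentivity exactly
    # when the object is causally dependent on the subject.
    return clause.object_depends_on_subject


def a_marking_required(clause: Clause) -> bool:
    """Thematic distinctness: a-marking if the subject does not outrank the object."""
    return not subject_outranks_object(clause)


clauses = [
    # The feature values below are illustrative assumptions about each example.
    Clause("(17) Pepe ve la película", object_depends_on_subject=True),
    Clause("(18a) Un artículo precede a un sustantivo", object_depends_on_subject=False),
    Clause("(18b) el vodka puede sustituir a la ginebra", object_depends_on_subject=False),
]

for c in clauses:
    print(c.example, "->", "a" if a_marking_required(c) else "ø")
```

The binary feature is a deliberate oversimplification of the Proto-Role model, in which agentivity also involves the number and content of the involvement properties.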

Although reversible verbs generally show a very strong preference for a-marked di- 
rect objects, there are some conspicuous differences among the lexical predicates that 
form this class. As I have shown in detail elsewhere (García García 2014: 162-167), this is
particularly obvious with respect to the sequencing verbs preceder ‘to precede’, suceder
‘to succeed’ and seguir ‘to follow’. In the corpus database ADESSE (20th century), inanimate
direct objects of preceder and suceder are exclusively attested with a-marking. This
suggests that these verbs have lexicalized the a-marker. However, in combination with
seguir, a-marking is only found in 7.5% (12/160) of the cases. The different behavior of
preceder ‘to precede’ and suceder ‘to succeed’, on the one hand, and seguir ‘to follow’, on
the other, is connected to the fact that the latter predicate is a polysemous verb. Seguir
can be used not only with a reversible meaning in the sense of ‘x comes after y’ (21a),
but also with different non-reversible meanings such as ‘to follow (with the eyes)’, ‘to
observe’ (21b) or ‘to continue’ (21c). As illustrated in (21), a-marking is only found when
seguir is used with the reversible meaning.


(21) a. la-s    pausa-s   que   sig-uen     [...]  a   su-s         tarea-s  de  copista
        the-PL  pause-PL  that  follow-3PL         to  3SG.POSS-PL  task-PL  of  copyist
        ‘the pauses that come after his tasks as a copyist’ (ADESSE, PAI: 086, 02)

     b. el   animal-it-o      [...]  segu-í-a         cada  movimiento  de  su-s    mano-s
        the  animal-DIM-MASC         follow-IPFV-3SG  each  movement    of  his-PL  hand-PL
        ‘the little animal followed/observed every movement of his hands’ (ADESSE,
        TER: 074, 16)

     c. te       quitaban  la   chuleta  y    segu-í-as        el   examen
        2SG.ACC  remove    the  crib     and  follow-IPFV-2SG  the  exam
        ‘they took the crib away from you and you continued the exam’ (ADESSE,
        MAD: 417, 05)


Whereas (21a) denotes a situation similar to the ones expressed in (18a) and (20), i.e. a 
merely temporal relation in which the object is as agentive as the subject argument, both 
the event referred to in (21b) and in (21c) involve an object that is clearly less agentive 
than the respective subject participant. This correlates with the absence of a-marking. 

In sum, the observations on reversible predicates show that the relative agentivity of 
the direct object is a crucial factor for DOM, at least as far as inanimate objects in Modern 
Spanish are concerned (for further evidence, including the other DOM-sensitive verb 
classes mentioned in (19), see García García 2014: Ch. 6). Building on these synchronic 
insights, let us now examine whether agentivity is also a diachronically relevant factor
for DOM in Spanish. 


4.3.2 DOM-sensitive verb classes from a diachronic perspective


It is noteworthy that, despite its rarity, DOM with inanimate objects is already attested
in older stages of Spanish, at least with definite NPs (cf. Table 5, Table 6 and (22)). As
noted by Laca (2006: 451), it typically occurs with certain verbal lexemes such as those
given in the examples from Fernando de Rojas’ Celestina (1499) and Miguel de Cervantes’
Don Quijote (1605, 1615) in (22).

(22) a. que   preced-e     a   lo   corporal
        that  precede-3SG  to  the  physical
        ‘that it precedes the physical things’ (Celestina, VI. 178, apud Laca 2006: 451)

     b. a   los  [...]  clar-o-s     sol-es,  nublad-o-s   scur-o-s   [...]  ve-mos   suced-er
        to  the         bright-M-PL  sun-PL   cloudy-M-PL  dark-M-PL         see-1PL  follow-INF
        ‘we see that bright sunlight is followed by dark clouds’ (Celestina, VIII. 215,
        apud Laca 2006: 451)

     c. La   noche  que   sigu-ió         al      día  del     rencuentro  de  la   Muerte.
        the  night  that  follow-3SG.PST  to.the  day  of.the  reunion     of  the  death
        ‘The night that followed the day with the reunion with death’ (Quijote, 752,
        apud Laca 2006: 451)

     d. Y    a   ést-a-s    llam-as   señales  de  salud.
        and  to  this-F-PL  call-2SG  signs    of  health
        ‘And you call those signs of health.’ (Celestina, VI. 178, apud Laca 2006: 451)

     e. la   voluntad  a   la   razón   no   obedec-e
        the  will      to  the  reason  NEG  obey-3SG
        ‘will does not obey reason’ (Celestina, I. 9, apud Laca 2006: 452)


Interestingly, most of these verbs correspond to the same verb classes that are also 
relevant for Modern Spanish: While the examples in (22a)-(22c) contain the sequencing 
verbs preceder ‘to precede’, suceder ‘to succeed’ and seguir ‘to follow’, (22d) shows a
double object construction with the verb of naming llamar ‘to call’. Besides, verbs having
a strong preference for (agent-like) human objects such as obedecer ‘to obey’ (22e) also
seem to allow for object marking with inanimates. In order to evaluate the diachronic 
influence of these verb classes and the impact of agentivity on DOM more thoroughly, 
further research is needed. 

As a first step towards this research task, I carried out a test corpus analysis for the
sequencing verbs preceder ‘to precede’ and seguir ‘to follow’. On the basis of the Corpus
del Español, I have checked data from the 13th to the 20th century. For each century, I
have analyzed the first 100 tokens with preceder and seguir, respectively. Data containing
animate objects as well as cliticized objects were excluded. As a consequence, only about
20 relevant tokens per verb and century could be evaluated. The results of the corpus
analysis are shown in Table 9 and the simplified representation in Figure 4.¹
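
The sampling procedure just described can be sketched in code. The following is only an illustrative sketch, not the procedure actually used for the study: the `Token` record, its field names and the function names are hypothetical, and the annotation of animacy, clitic status and a-marking is assumed to have been done beforehand.

```python
from dataclasses import dataclass
from itertools import islice


@dataclass(frozen=True)
class Token:
    """One corpus hit (hypothetical record layout, for illustration only)."""
    verb: str             # e.g. "preceder" or "seguir"
    century: int          # 13 to 20
    object_animate: bool  # True if the direct object is animate
    object_clitic: bool   # True if the direct object is a clitic
    a_marked: bool        # True if the direct object carries a-marking


def relevant_tokens(tokens, verb, century, sample_size=100):
    """Take the first `sample_size` hits for a verb and century (in corpus
    order), then exclude animate and cliticized objects."""
    matching = (t for t in tokens if t.verb == verb and t.century == century)
    sample = islice(matching, sample_size)
    return [t for t in sample if not t.object_animate and not t.object_clitic]


def a_marking_counts(tokens):
    """Return (number of a-marked objects, number of tokens)."""
    return sum(t.a_marked for t in tokens), len(tokens)
```

Because the exclusion step applies after the first 100 hits are drawn, the number of evaluable tokens per cell ends up well below 100, which matches the roughly 20 tokens per verb and century mentioned above.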


Table 9: Distribution of DOM with inanimate objects depending on preceder
‘precede’ and seguir ‘follow’ (Corpus del Español)

            XIII    XIV     XV       XVI      XVII    XVIII    XIX      XX
preceder    100%    –       85%      77%      88%     92%      94%      98%
            (1/1)           (11/13)  (20/26)  (7/8)   (22/24)  (29/31)  (39/40)
seguir      29%     6%      5%       10%      6%      19%      22%      13%
            (6/21)  (1/17)  (1/22)   (3/30)   (2/34)  (6/32)   (4/18)   (3/23)


[Figure 4: chart with two data series, preceder and seguir]

Figure 4: Percentages of a-marking with inanimate objects depending on preceder
‘to precede’ and seguir ‘to follow’ (Corpus del Español)


Table 9 and Figure 4 allow for the following observations: Firstly, in combination with
the sequencing verbs preceder ‘to precede’ and seguir ‘to follow’, a-marking of inanimate
objects is already attested in the 13th century. Since then, the frequency of DOM with
these verbs has remained quite stable. Note that although a-marking shows a minimal
increase from the 18th century onwards, the highest percentages of DOM with both
verbs are documented in the 13th century. This suggests that there have not been any
significant changes, either for DOM in combination with preceder ‘to precede’ or with
seguir ‘to follow’. Secondly, the verbs obviously have a very different affinity for DOM
over time. While a-marking with preceder is highly frequent, ranging between 77% and
100%, with seguir it is rather rare. With this verb, the percentages of inanimate objects
with a-marking only range between 5% and 29%.

¹ In contrast to Table 9, Figure 4 does not include the findings for the 14th century. In this century, only data
with seguir ‘to follow’ but no relevant tokens with the verb preceder ‘to precede’ were found.
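
The ranges just cited can be recomputed from the raw counts in Table 9. The sketch below is merely a convenience for checking the arithmetic; the variable and function names are illustrative, and the 15th-century count for seguir is entered as 1/22 on the assumption that this is the value behind the reported 5%.

```python
# Counts (a-marked objects, all relevant objects) per century, from Table 9.
# None marks the 14th century, where no relevant tokens of preceder were found.
# Assumption: the 15th-century seguir cell is 1/22, the only count out of 22
# that rounds to the reported 5%.
CENTURIES = ["XIII", "XIV", "XV", "XVI", "XVII", "XVIII", "XIX", "XX"]
TABLE_9 = {
    "preceder": [(1, 1), None, (11, 13), (20, 26), (7, 8), (22, 24), (29, 31), (39, 40)],
    "seguir": [(6, 21), (1, 17), (1, 22), (3, 30), (2, 34), (6, 32), (4, 18), (3, 23)],
}


def rates(verb):
    """Map each century with data to the percentage of a-marked objects."""
    return {
        century: 100 * cell[0] / cell[1]
        for century, cell in zip(CENTURIES, TABLE_9[verb])
        if cell is not None
    }


def rate_range(verb):
    """Return the lowest and highest percentage attested for a verb."""
    values = rates(verb).values()
    return min(values), max(values)


if __name__ == "__main__":
    for verb in TABLE_9:
        low, high = rate_range(verb)
        print(f"{verb}: {low:.0f}%-{high:.0f}%")
```

Run as a script, this prints `preceder: 77%-100%` and `seguir: 5%-29%`, and for both verbs the maximum falls in the 13th century.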

A closer look at the data reveals that the different diachronic behavior of these verbs is 
due to the same role-semantic reasons as in Modern Spanish. The verb preceder is nearly 
exclusively documented with a reversible meaning in the sense of ‘x comes before y’, as
in (23a). Only twice is it found within a non-reversible predication, as in (23b). Here, it is
not restricted to the denotation of a mere sequencing event, but rather used in the sense
of ‘to guide’ or ‘to determine’, thus expressing a causation between the subject and the
object participant (cf. Delbecque 2002: 92-93 for similar meaning variations of preceder 
in Modern Spanish). 


(23) a. El   matrimonio  [...]  preced-e     alos    otr-o-s     sacramento-s.
        the  marriage           precede-3SG  to.the  other-M-PL  sacrament-PL
        ‘Marriage precedes the other sacraments.’ (13th century, Alf. X., Siete partidas)

     b. la   certeza    y    seguiridad  [...]  deb-e     preced-er    su        ejercicio
        the  certainty  and  confidence         must-3SG  precede-INF  3SG.POSS  practice
        ‘certainty and confidence must guide his practice’ (16th century, Solórzano
        Pereira, Política indiana)


Contrary to preceder, the verb seguir is only rarely attested with a reversible predication
in the sense of ‘x comes after y’, as in (24a). It is used much more frequently with a
non-reversible meaning such as ‘to continue’, illustrated in (24b).


(24) a. sigu-e      ala     primer-a  faz    de  Aries
        follow-3SG  to.the  first-F   phase  of  Aries
        ‘it follows/comes after the first phase of Aries’ (13th century, Alf. X., Judizios
        de las estrellas)

     b. non  quis-o        ssegu-ir    el   pleito
        NEG  want.PST-3SG  follow-INF  the  lawsuit
        ‘he did not want to continue the lawsuit’ (13th century, Alf. X., Espéculo)


As shown in (23) and (24), inanimate objects of reversible relations are regularly
marked with a, both in combination with preceder and seguir, while those found in
non-reversible predications, which are much more common with seguir, lack a-marking.
These observations suggest that it is not the verb per se that triggers DOM through time
but rather the agentivity of the direct object that follows from the more or less frequently
attested reversible meanings of the investigated verbs. This claim is supported by the
synchronic distribution of DOM found with most of the other DOM-sensitive verb classes
mentioned in (19). A case in point are the verbs of replacement sustituir ‘to substitute’
and reemplazar ‘to replace’: Similar to seguir ‘to follow’, both sustituir and reemplazar
have a reversible meaning (‘x takes the place of y’) and a non-reversible meaning (‘x
substitutes/replaces y (with z)’), whereby the reversible variant patterns systematically
with DOM and the non-reversible variant patterns with the absence of object marking
(cf. Weissenrieder 1985: 395-396; García García 2014: 149-154). However, so far, these
verbs have only been examined in Modern Spanish.

In order to obtain a more detailed picture of the diachronic impact of agentivity on
DOM, the diachronic test corpus study undertaken for preceder ‘to precede’ and seguir
‘to follow’ must be complemented by empirical analyses considering all the other
DOM-sensitive verb classes mentioned in (19), in particular by verbs of replacement (e.g.
reemplazar ‘to replace’), verbs of attribution (e.g. caracterizar ‘to characterize’) and verbs of
competition (e.g. vencer ‘to win’).


4.4 Accusativus-cum-infinitivo-constructions (AcI)


This section deals with AcI-constructions with causative and perception verbs. Thus, it 
does not consider a proper verbal but a constructional parameter. As we will see, AcI- 
constructions also seem to underpin the (diachronic) influence of agentivity on DOM. 
Let us reconsider the diachronic development of DOM with human indefinite objects reported
in §3.2. As illustrated in Table 2 and Figure 1, the expansion of a-marking with this
subset of objects shows a striking irregularity. While there are 40% (21/53) of a-marked 
objects in the 17th century and 41% (12/29) in the 19th century, the greatest percentage of 
a-marking with indefinite human objects is found in the 18th century, showing a remark- 
able peak of 63% (20/32). As noted by Laca (2006: 460), the relatively high percentage 
of a-marked objects found in this century is due to the disproportionately high number 
of causative constructions attested in one of the corresponding text samples, namely 
the Documentos lingüísticos de la Nueva España. In this text sample, 9 out of 12 of the a- 
marked indefinite human objects contain a causative construction such as the one given 
in (25). 


(25) hiz-o         parec-er    ante    sí    a   un  yndio   que  [...]
     make.PST-3SG  appear-INF  before  REFL  to  an  Indian  who
     dij-o        llamarse      Pedro  Martín
     say.PST-3SG  to.be.called  Pedro  Martín
     ‘He summoned to him an Indian who said that he was called Pedro Martín.’
     (DLNE, 1733, 189.487, apud Laca 2006: 460)


The affinity of AcI-constructions for DOM is not only evidenced by constructions with
causative verbs, but also by those with perception verbs. Although DOM is probably
less frequent with the latter type of AcI-construction than with the causative type (cf.
Roegiest 2003: 316-317), it is still very common to also use the object marker in
AcI-constructions with perception verbs, at least in Modern Spanish:


(26) Se    o-yó          maull-ar  a   un  gato.
     REFL  hear-3SG.PST  meow-INF  to  a   cat
     ‘We heard the meowing of a cat.’ (Corrales Egea, apud Roegiest 1979: 50)


The question here is why AcI-constructions show such a striking preference for DOM. 
One can assume that this is due to agentivity, i.e. to the semi-agentive status of the ob- 
ject participant. As argued by Roegiest (1979: 50), the direct object of the matrix verb is 
concurrently the "subject" of the infinitival verb, whereby the latter relation involves an 
"activation" of the object, that is, an agentive interpretation of the corresponding par- 
ticipant. Within the Proto-Role model, it can be specified that the second participant 
of an AcI-construction shows both proto-agent and proto-patient properties (cf. Primus 
1999b: 161-162). This is particularly obvious with respect to (26). Whereas the first ar- 
gument of the perception event denoted by oír “to hear’ has the Proto-Agent property 
experience, the second argument, i.e. the indefinite non-human NP un gato “a cat’, is not 
only characterized by the converse Proto-Patient property of being experienced, i.e. of 
being perceived, but also by the Proto-Agent property move, entailed by the infinitival 
verb maullar 'to meow'. Note that the Proto-Agent property move is associated with any 
form of autonomous physical activity (cf. Primus 2006: 55). 

The close connection between the direct object’s agentivity and DOM is also corroborated
by Enghels’s (2007: 241-273) fine-grained study on AcI-constructions with perception
verbs in Modern Spanish (cf. also Torrego Salcedo 1999: 1792). Enghels differentiates
between different factors that determine the agentivity degree of the direct object, i.e. of
the second argument of an AcI-construction, such as (i) the modality of the perception
verb (visual vs. auditory), (ii) the animacy of the second argument (human, animate,
inanimate etc.) and (iii) the semantics of the infinitival verb (transitive, unergative,
unaccusative). With respect to the latter factor, it is assumed that AcI-constructions embedding
predicates that are transitive, such as matar ‘to kill’, presuppose a high agentivity
degree of the second argument, while AcI-constructions embedding unergative verbs
such as reír ‘to laugh’ and those having unaccusative verbs such as morir ‘to die’ imply
a respectively lower agentivity degree of the second argument. Enghels’s (2007: 241-273)
findings reveal that the more the mentioned factors indicate an agentive interpretation of
the direct object argument, the greater the probability for a-marking. Though the modality
of the perception verb (visual vs. auditory) and the animacy of the second argument
are the most relevant factors, there is also a clear and independent effect with respect to
the semantics of the infinitival verb (cf. Table 10).

Table 10 represents the influence of the embedded infinitival predicate on DOM in AcI- 
constructions with human direct objects. As can be observed, a-marking is noticeably 
more frequent with transitive verbs (98.6%) than with intransitive verbs, especially in 
comparison with unaccusative verbs (71.1%), that is, with those predicates presupposing 
the lowest agentivity degree of the direct object. 
Table 10: Distribution of DOM with human objects in AcI-constructions depending
on the semantics of the infinitival predicate (adapted from Enghels 2007: 268)

infinitival predicate    DO                 a DO
transitive               1.4% (5/369)       98.6% (364/369)
unergative               5.5% (17/308)      94.5% (291/308)
unaccusative             28.9% (123/425)    71.1% (302/425)
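
The a-marking percentages in Table 10 follow directly from the raw counts, and the resulting ranking of predicate classes can be checked with a few lines of code. This is only a sketch for verifying the arithmetic; the names used are illustrative and not part of Enghels’s study.

```python
# (a-marked objects, all human objects) per class of infinitival predicate,
# taken from the counts reported in Table 10.
TABLE_10 = {
    "transitive": (364, 369),
    "unergative": (291, 308),
    "unaccusative": (302, 425),
}


def a_marking_rate(marked, total):
    """Percentage of objects that carry a-marking."""
    return 100 * marked / total


# Percentage of a-marking for each predicate class.
RATES = {cls: a_marking_rate(*counts) for cls, counts in TABLE_10.items()}

# Predicate classes ordered from most to least frequent a-marking.
RANKED = sorted(RATES, key=RATES.get, reverse=True)

if __name__ == "__main__":
    for cls in RANKED:
        print(f"{cls}: {RATES[cls]:.1f}%")
```

The ordering that comes out (transitive, then unergative, then unaccusative) mirrors the assumed decrease in the agentivity of the second argument.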


5 Conclusion 


In Spanish, DOM is diachronically triggered not only by nominal, but also by verbal
parameters. The general picture that emerges from the current research on nominal parameters
(animacy and definiteness) is that DOM is a remarkably stable system. Although there 
has clearly been an evolution of DOM from Old to Modern Spanish, this development 
is basically restricted to human definite and indefinite objects (cf. Table 4). Other NP 
types do not seem to have undergone any remarkable changes. This applies in particular 
to the category of inanimates: The a-marking of inanimate direct objects was and still 
is a scarcely attested phenomenon (cf. Figure 2). Thus, there is no clear support for the 
hypothesis that the a-marker is grammaticalizing into a proper accusative case marker 
and, consequently, that Spanish is changing from a language with DOM to a language 
without DOM. Nevertheless, it would be wrong to conclude that DOM in Spanish is 
essentially driven by humanness. 

The discussion of verbal parameters has revealed that the occurrence of DOM through 
time is also influenced by agentivity, affectedness and, in some rather inconsistent way, 
also by aspect. As for agentivity, the test corpus analysis of preceder ‘to precede’ and
seguir ‘to follow’ (13th-20th century) has shown that agentive objects require a-marking
even when the referent is inanimate. Thus, in both Modern and Old Spanish, agentivity 
overrides the strong DOM condition of humanness. Further evidence for the relevance of 
agentivity is provided by the unexpected preference for DOM with verbs such as temer 
‘to fear’ (cf. von Heusinger & Kaiser 2011: 613), as well as by AcI-constructions, which 
also show a clear preference for DOM, at least from the 18th century on. In these con- 
structions the direct object not only functions as a patient, but also as an agent argument. 

Note that the conclusion that DOM is diachronically conditioned by both the object's 
humanness and the object's agentivity is no contradiction. On the contrary, humanness 
can be taken as an inherent nominal feature that encodes a very typical, though not nec- 
essary, property of an agent. As pointed out by Delbecque (1998: 398) and Primus (2012: 
78-79), among others, human direct objects can be conceived of as potential agents. 

The interaction of nominal and verbal parameters, though, remains challenging. As 
has been shown, diachronic DOM also depends on affectedness and, to some extent, on 
telicity. However, these factors only seem to be relevant with respect to human objects. 
While there are some telic predicates involving a highly affected object that have lexicalized
the a-marker, such as matar ‘to kill’ and insultar ‘to insult’, it must be emphasized
that these verbs only accept human or at least animate objects. If we only consider inanimate
objects, telicity has a rather negative influence on diachronic DOM (cf. Table 6).
Besides, we also find atelic verbs selecting a non-affected object such as preceder ‘to precede’
and suceder ‘to succeed’ that seem to have lexicalized DOM, too. This leads to the
puzzling conclusion that, in terms of Hopper & Thompson (1980), DOM in Spanish is 
driven by both extremely high and extremely low transitivity (cf. also Fábregas 2013: 67). 
Obligatory a-marking is not only found with human, strongly affected objects involved 
in a telic event, but also with inanimate, non-affected and agentive objects embedded in 
a stative event. 

In order to understand these contrary facts, more research on the interaction of nom- 
inal and verbal parameters is needed. In particular, systematic analyses of agentivity, 
affectedness and telicity that are independent of animacy are necessary. 


Acknowledgments 


I would like to thank Ilja A. Seržant, Javier Caro Reina and Klaus von Heusinger for their
very useful comments on the entire manuscript. 


References 


Aissen, Judith. 2003. Differential object marking: Iconicity vs. Economy. Natural Language
and Linguistic Theory 21(3). 435–483.

Barraza Carbajal, Georgina. 2003. Evolución del objeto directo inanimado en español. Mex- 
ico City: Universidad Nacional Autónoma de México dissertation. 

Barraza Carbajal, Georgina. 2008. Marcación preposicional de objeto directo inanimado.
In Actas del VII Congreso Internacional de Historia de la Lengua Española: Mérida
(Yucatán), 4–8 septiembre de 2006, 341-352. Madrid: Arco Libros.

Bickel, Balthasar, Alena Witzlack-Makarevich & Taras Zakharko. 2015. Typological ev- 
idence against universal effects of referential scales on case alignment. In Ina Born- 
kessel-Schlesewsky, Andrej L. Malchukov & Marc Richards (eds.), Scales, 7-43. Berlin: 
de Gruyter Mouton. 

Bossong, Georg. 1984. Review of Francisco Villar. Ergatividad, acusatividad y género 
en la familia lingüística indoeuropea. Theses et Studia Philologica Salmanticensia 21. 
Salamanca. 1983. Lingua 62(3). 239-247.

Bossong, Georg. 1991. Differential object marking in Romance and beyond. In Dieter 
Wanner & Douglas A. Kibbee (eds.), New analyses in Romance linguistics. Selected pa- 
pers from the XVIII Linguistic Symposium on Romance Languages, Urbana-Champaign, 
April 7-9, 1988, 143-170. Amsterdam: John Benjamins. 

Bossong, Georg. 1998. Le marquage différentiel de l'objet dans les langues d'Europe. In 
Jack Feuillet (ed.), Actance et valence dans les langues de l'Europe, 193-258. Berlin: De 
Gruyter Mouton. 




Buyse, Kris. 1998. The Spanish prepositional accusative. What grammars say versus what 
corpora tell us about it. Leuvense Bijdragen 87. 371-385. 

Company Company, Concepción. 2002a. El avance diacrónico de la marcación preposi- 
tiva en objetos directos inanimados. In Alberto Bernabé, José Antonio Berenguer, Mar- 
garita Cantarero & José Carlos de Torres (eds.), Presente y futuro de la lingüística en 
España. La Sociedad de Lingüística 30 años después, 146-154. Madrid: Consejo Superior 
de Investigaciones Científicas. 

Company Company, Concepción. 2002b. Grammaticalization and category weakness. In 
Ilse Wischer & Gabriele Diewald (eds.), New reflections on grammaticalization, 201-215.
Amsterdam: John Benjamins. 

Dalrymple, Mary & Irina Nikolaeva. 2011. Objects and information structure. Cambridge: 
Cambridge University Press. 

de Swart, Peter. 2007. Cross-linguistic variation in object marking. Nijmegen: Radboud 
University Nijmegen dissertation. 

Delbecque, Nicole. 1998. Why Spanish has two transitive construction frames. Leuvense 
Bijdragen 87. 387-514. 

Delbecque, Nicole. 2002. A construction grammar approach to transitivity in Spanish. 
In Kristine Davidse & Béatrice Lamiroy (eds.), The nominative & accusative and their 
counterparts, 81-130. Amsterdam: John Benjamins. 

Delille, Karl Heinz. 1970. Die geschichtliche Entwicklung des präpositionalen Akkusativs
im Portugiesischen. Bonn: Romanisches Seminar der Universität Bonn.

Dowty, David R. 1991. Thematic proto-roles and argument selection. Language 67(3).
547-619.

Enghels, Renata. 2007. Les modalités de perception visuelle et auditive. Différences
conceptuelles et répercussions sémantico-syntaxiques en espagnol et en français. Tübingen:
Niemeyer.

Fábregas, Antonio. 2013. Differential object marking in Spanish: State of the art. Borealis 
2. 1-80. 

García García, Marco. 2007. Differential object marking with inanimate objects. In
Proceedings of the workshop “Definiteness, specificity and animacy in Ibero-Romance
languages”, 63-84. Universität Konstanz: Fachbereich Sprachwissenschaft (Arbeitspapier
122).

García García, Marco. 2014. Differentielle Objektmarkierung bei unbelebten Objekten im 
Spanischen. Berlin: De Gruyter Mouton. 

García, Érica C. 1993. Syntactic diffusion and the irreversibility of linguistic change: Per- 
sonal a in Old Spanish. In Jürgen Schmidt-Radefeldt & Andreas Harder (eds.), Sprach- 
wandel und Sprachgeschichte, Festschrift für Helmut Lüdtke zum 65. Geburtstag, 33-50. 
Tübingen: Narr. 

Haspelmath, Martin. 2014. Descriptive scales versus comparative scales. In Ina 
Bornkessel-Schlesewsky, Andrej L. Malchukov & Marc Richards (eds.), Scales and hi- 
erarchies. A cross-disciplinary perspective, 45-58. Berlin: de Gruyter Mouton. 

Hopper, Paul J. & Sandra A. Thompson. 1980. Transitivity in grammar and discourse. 
Language 56(2). 251-299. 




Iemmolo, Giorgio. n.d. Differential object marking. University of Zurich. 

Kirsner, Robert S. & Sandra A. Thompson. 1976. The role of pragmatic inference in se- 
mantics: A study of sensory verb complements in English. Glossa 10. 200-240. 

Laca, Brenda. 1995. Sobre el uso del acusativo preposicional en español. In C. Pensado 
(ed.), El complemento directo preposicional, 61-91. Madrid: Visor Libros. 

Laca, Brenda. 2002. Gramaticalización y variabilidad: Propiedades inherentes y fac- 
tores contextuales en la evolución del acusativo preposicional en español. In Andreas 
Wesch, Waltraud Weidenbusch, Rolf Kailuweit & Brenda Laca (eds.), Sprachgeschichte 
als Varietätengeschichte. Historia de las variedades lingüística, 195-203. Tübingen:
Stauffenburg. 

Laca, Brenda. 2006. El objeto directo. La marcación preposicional. In Concepción Com- 
pany Company (ed.), Sintaxis historica del español. Primera parte: La frase verbal, vol. 1, 
423-475. Mexico: Universidad Nacional Autónoma de México. 

Leonetti, Manuel. 2004. Specificity and differential object marking in Spanish. Catalan 
Journal of Linguistics 3. 75-114. 

Melis, Chantal. 1995. El objeto directo personal en El Cantar de Mio Cid. Estudio 
sintáctico-pragmático. In Carmen Pensado (ed.), El complemento directo preposicional, 
133-163. Madrid: Visor Libros. 

Ortiz Ciscomani, Rosa María. 2005. Los objetos concurrentes y la bitransitividad en el 
español en perspectiva diacrónica. In David Eddington (ed.), Selected Proceedings of the 
7th Hispanic Linguistics Symposium, 192-202. Somerville, Massachusetts: Cascadilla 
Proceedings Project. 

Ortiz Ciscomani, Rosa María. 2011. Construcciones bitransitivas en la historia. México: 
Universidad Nacional Autónoma de México, Instituto de Investigaciones Filológicas.

Pensado, Carmen. 1995. La creación del complemento directo preposicional y la flexión 
de los pronombres personales en las lenguas románicas. In Carmen Pensado (ed.), El 
complemento directo preposicional, 179-233. Madrid: Visor Libros. 

Pottier, Bernard. 1968. L'emploi de la préposition a devant l'objet en espagnol. Bulletin 
de la Société de Linguistique 1. 83-95. 

Primus, Beatrice. 1999a. Cases and thematic roles. Ergative, accusative and active. Tübin- 
gen: Niemeyer. 

Primus, Beatrice. 1999b. Rektionsprinzipien. In Heide Wegener (ed.), Deutsch — kontrastiv. 
Typologisch vergleichende Untersuchungen zur deutschen Grammatik, 135-170. Tübingen:
Stauffenburg.

Primus, Beatrice. 2006. Hierarchy mismatches and the dimensions of role semantics. In 
Ina Bornkessel, Matthias Schlesewsky, Bernard Comrie & Angela D. Friederici (eds.), 
Semantic role universals and argument linking. Theoretical, typological and psycholin- 
guistic perspectives, 53-88. Berlin: Mouton de Gruyter. 

Primus, Beatrice. 2012. Animacy, generalized semantic roles, and differential object mark- 
ing. In Monique Lamers & Peter de Swart (eds.), Case, word order and prominence: In- 
teracting cues in language production and comprehension, 65-90. Dordrecht: Springer. 

Real Academia Española. 2010. Nueva gramática de la lengua española. Madrid: Espasa- 
Calpe. 




Roegiest, Eugeen. 1979. A propos de l'accusatif prépositionnel dans quelques langues 
romanes. Vox Romanica 38. 37-54. 

Roegiest, Eugeen. 2003. Argument structure of perception verbs and actance variation 
of the Spanish direct object. In Giuliana Fiorentino (ed.), Romance objects. Transitivity 
in Romance languages, 299–322. Berlin: De Gruyter Mouton.

Silverstein, Michael. 1976. Hierarchy of features and ergativity. In R. M. W. Dixon (ed.), 
Grammatical categories in Australian languages, 112-171. Atlantic Highlands, NJ: Hu- 
manities Press. 

Sinnemäki, Kaius. 2014. A typological perspective on Differential Object Marking.
Linguistics 52(2). 281-313.

Spitzer, Leo. 1928. Rum. P(r)e, span. á vor persönlichem Akkusativobjekt. Zeitschrift für 
romanische Philologie 48. 423–432.

Tippets, Ian. 2011. Differential object marking: Quantitative evidence for underlying hie- 
rarchical constraints across Spanish dialects. In Luis A. Ortiz-López (ed.), Selected 
proceedings of the 13th Hispanic Linguistics Symposium, 107-117. Somerville, MA: Cas- 
cadilla Proceedings Project. 

Torrego Salcedo, Esther. 1999. El complemento directo preposicional. In Ignacio Bosque &
Violeta Demonte (eds.), Gramática descriptiva de la lengua española. Las construcciones
sintácticas fundamentales. Relaciones temporales, aspectuales y modales, vol. 2,
1779-1805. Madrid: Espasa Calpe.

Tsunoda, Tasaku. 1985. Remarks on transitivity. Journal of Linguistics 21(2). 385-396. 

Vendler, Zeno. 1957. Verbs and times. The Philosophical Review 66(2). 143-160. 

Villar, Francisco. 1983. Ergatividad, acusatividad y género en la familia lingüística
indoeuropea. Salamanca: Ediciones Universidad de Salamanca.

von Heusinger, Klaus. 2008. Verbal semantics and the diachronic development of DOM 
in Spanish. Probus 20. 1-31. 

von Heusinger, Klaus. 2018. The diachronic development of Differential Object Marking
in Spanish ditransitive constructions. In Ilja A. Seržant & Alena Witzlack-Makarevich
(eds.), The diachronic typology of differential argument marking, 315-344. Berlin:
Language Science Press. DOI:10.5/

von Heusinger, Klaus & Georg A. Kaiser. 2005. The evolution of differential object marking
in Spanish. In Klaus von Heusinger, Georg A. Kaiser & Elisabeth Stark (eds.),
Proceedings of the Workshop “Specificity and the Evolution/Emergence of Nominal
Determination Systems in Romance”, 33-69. Universität Konstanz: Fachbereich
Sprachwissenschaft (Arbeitspapier 119).

von Heusinger, Klaus & Georg A. Kaiser. 2011. Affectedness and differential object mark- 
ing in Spanish. Morphology 21. 593-617. 

Weissenrieder, Maureen. 1985. Exceptional uses of the accusative a. Hispania 68(2). 393- 
398. 

Weissenrieder, Maureen. 1991. A functional approach to the accusative A. Hispania 74(1). 
146-156. 

Witzlack-Makarevich, Alena & Ilja A. Seržant. 2018. Differential argument marking:
Patterns of variation. In Ilja A. Seržant & Alena Witzlack-Makarevich (eds.), Diachrony
of differential argument marking, 1-40. Berlin: Language Science Press.
DOI:10.5281/zenodo.1228243


Chapter 9 


Emergence of optional accusative case 
marking in Khoe languages 


William B. McGregor 


Aarhus University 


A number of languages of the Khoe family – one of three genetic lineages comprising
southern African Khoisan – show an accusative marker, typically a postposition which in its
elsewhere form has the shape (-)(ʔ)à. In all languages for which adequate data is available,
this postposition is optional on object NPs, at least in some circumstances. A few proposals
have been made for the grammaticalisation of this marker, notably by Kilian-Hatz (2008: 55;
2013: 376-378). However, not only are these proposals specific to the Khwe language, but
they also fail to account for the fact that (-)(ʔ)à marks the accusative and that it is optional.
In this paper I widen the net to the Khoe family as a whole, and consider the synchronic
situations for the usage of the marker (-)(ʔ)à and its putative cognates in those languages for
which pertinent data is available. This is used to motivate a diachronic proposal concerning
the grammaticalisation of (-)(ʔ)à in the modern languages. Specifically, it is proposed
that the accusative marker began life as a presentative copula; this served to index an item,
drawing the addressee's attention to it. It later became an optional accusative marker via
grammaticalisation processes akin to those outlined in McGregor (2008; 2010; 2013; 2017)
for the development of optional ergative case markers in some Australian languages. Thus
the grammaticalisation scenario proposed is consistent with pathways of development of
other optional case-markers.


1 Introduction 


1.1 Aims


This paper is concerned with the grammaticalisation of optional accusative marking in
the Khoe languages of southern Africa. I argue that the accusative marker, which generally
takes the elsewhere form (-)(ʔ)à in Khoe languages, began as a copula in presentative
clauses (McGregor 1997: 90-91, 307-310); see also Kilian-Hatz (2008: 55, 2013: 376-377);
König (2008: 276) for a similar suggestion. This was also employed to draw attention
to certain NPs in verbal clauses, especially unexpected or atypical objects of transitive
clauses. It subsequently developed into an optional accusative marker in most of the
languages. This scenario is supported on the one hand by an examination of the range of
synchronic uses of the accusative marker and possible cognates, and on the other by
evidence from the grammaticalisation of other optional case markers, in particular optional
ergative markers, as discussed in McGregor (2008; 2010; 2013).

The paper is organised as follows. After providing an outline of the lineage, and sources
of information on the languages in §1.2, the subsequent two sections set the scene for
the grammaticalisation scenario proposed in §4. §2 presents a detailed overview of the
uses of the marker (-)(ʔ)à, including possibly homophonous and/or cognate morphemes
in those Khoe languages for which information is available. Following this, §3 presents a
discussion of the motivations that have been proposed for the choice between using and
not using the accusative marker on object NPs in a small selection of Khoe languages –
for the majority of the languages information on this issue is not available. The paper is
wound up in §5 with a brief conclusion.


1.2 The languages and sources of information 


Khoe is a branch of the putative Khoe-Kwadi family (Greenberg's Central Khoisan) 
(Güldemann 2004; Güldemann & Elderkin 2010; Voßen 1997; Vossen 2013c: 10-11), one 
of three distinct Khoisan lineages found in southern Africa. A tentative tree for Khoe- 
Kwadi is shown in Figure 1. 


[Tree diagram: Khoe-Kwadi branches into Kwadi and Khoe; Khoe branches into Khoekhoe and
Kalahari Khoe; Kalahari Khoe branches into East (Shua, Tshwa) and West (Kxoe, Gǁana, Naro).]


Figure 1: A possible tree for the Khoe-Kwadi lineage (based on Güldemann
2014: 27)


It should be noted, however, that there are a number of uncertainties: not all specialists
agree that the evidence convincingly supports Kwadi as a sister of Khoe; the referents of
the terms for many varieties are uncertain (e.g. Tyiretyire (Cirecire) and its relation to
Cua); the placement of some varieties is tentative (e.g. of Ts'ixa as a Shua variety); and
Khoekhoe is sometimes divided into north and south (e.g. Güldemann & Vossen 2000:
102, cf. Güldemann 2014: 28).

The main languages dealt with in this paper, along with the primary sources of information
on them, are listed in Table 1.¹ There is insufficient information on Kwadi to
permit its inclusion in this study. Otherwise, all of the Khoe groups and subgroups are 
represented by at least one language; unfortunately, however, the data available for some 
subgroups is seriously inadequate. 


Table 1: Languages and main sources of data 


Group               Language      Main sources
Khoekhoe            Nama-Damara   Hagman (1973)
                    !Ora          Haacke (2013b)
East Kalahari Khoe  Nata Shua     Own fieldnotes
                    Danisi        Fieldnotes Fehn & McGregor; Vossen (2013a)
                    Ts'ixa        Fehn (2014)
                    Tyiretyire    Own fieldnotes
West Kalahari Khoe  Khwe          Kilian-Hatz (2008; 2013)
                    ǁAni          Heine (1999)
                    Gǀui          Ono (2011)
                    Naro          Visser (2013)


2 The marker(s) (-)(ʔ)à in Khoe languages


One or more grammatical markers showing the shape (-)(ʔ)à are attested in all Khoe languages
that have been sufficiently well described;² these are found in languages of all
three branches. There are a number of differences in the range of uses of these markers
across the languages, as shown in Tables 2-4. Note that these tables identify grammatical
uses of morphemes with the shape (-)(ʔ)à, regardless of whether or not they
represent different uses of a single morpheme or distinct morphemes - which in many


¹In the remainder of the paper the term Shua will be used in reference to the variety spoken in Nata. Where
reference is made to the set of varieties in the subgroup I will speak of Shua varieties.

²Sources are inconsistent in representing an initial glottal stop. In some languages two distinct allomorphs
exist, one with and one without an initial glottal stop. Various other allomorphs are found in particular
languages, including allomorphs with different vowel shapes (usually conditioned by preceding segments)
and fused allomorphs (often morphologically conditioned by a preceding person-gender-number marker
or pronoun). Discussion of the allomorphy is beyond the scope of the present paper, although it is clearly
crucial to a complete and convincing grammaticalisation story.
Table 2: NP role marking functions of morpheme or morphemes (-)(ʔ)à

(1 = (almost) certainly a use of the form in the language: either attested or implied
by the description; 0 = evidence suggests not a use of the form in the language; — =
unattested use in the language, though information is insufficient to determine
whether it is a possible use.)


Language      Object  Indirect  Subject  Topic  Focus  Locative  Genitive
                      object
Shua          1       1         0        0      0      1         —
Ts'ixa        1       1         0        0      0      1         —
Tyiretyire    1       1         0        0      —      —         —
Khwe          1       1         1        0      1      1         1
ǁAni          1       1         1        1      —      0         0
Buga          1       1         —        0      —      0         0
Naro          1       1         —        0      —      0         0
Gǀui          1       1         —        0      0      0         0
Nama-Damara   1       1         1        0      0      1 (temp)  0
!Ora          1       1         1        —      —      —         —
Haiǁom        1       1         1        —      —      —         —


cases is not known for certain.³,⁴ Nor has it yet been established that the morphemes are
all cognate. Moreover, the listing is incomplete. For expository purposes I have been selective,
and excluded those functions (and possibly morphemes) that are irrelevant to the
grammaticalisation scenario proposed in this paper. For instance, most Khoe languages
have a verbal juncture morpheme, one allomorph of which is -a (Vossen 2010). Whether
or not this morpheme is cognate with (-)(ʔ)à, it plays no role in the grammaticalisation
scenario proposed in §4.

Two functions are universally associated with (-)(ʔ)à in Khoe languages. First, in every
language (-)(ʔ)à is attested as a marker of both direct objects and indirect objects. This is
illustrated in the Khwe example (1), where the marker is a free postposition, as in other
Kalahari Khoe languages. In at least some of these languages there is an allomorph that
fuses with a preceding pronoun or person-gender-number (PGN) marker, a portmanteau
morph attached to a nominal and encoding its person, grammatical gender and number.
For instance, in Shua one finds Pitama: ~ Pita-ma-ʔa (Peter-M-ACC) ‘Peter’ and taa


³For this reason, I adopt the convention of glossing (-)(ʔ)à according to its putative function, rather than
with a single gloss, except where the evidence indicates that a single morpheme is involved. It should not,
of course, be presumed that each gloss corresponds to a different, homophonous morpheme, although it
may.

⁴Kilian-Hatz states explicitly that there is a single morpheme (ʔ)à in Khwe with the range of senses indicated
in Tables 2-4 (Kilian-Hatz 2008: 52-53, 2013: 368). Whether or not this proposal is viable remains
unclear to me.
Table 3: Other types of NP marking by morpheme(s) (-)(ʔ)à


Language      Dislocated NPs  Appositive NP    Governed by postposition
Shua          0               —                0
Ts'ixa        0               —                0
Tyiretyire    0               —                0
Khwe          0               1 (attributing)  1
ǁAni          0               —                0
Buga          —               —                —
Naro          —               —                0
Gǀui          1               1 (identifying)  0
Nama-Damara   —               —                1
!Ora          —               —                —
Haiǁom        —               —                —


Table 4: Other uses of morpheme(s) (-)(ʔ)à


Language      Relational  Presentative  Clausal    Extraposed
              copula      copula        connector  elements
Shua          0           0             —          0
Ts'ixa        0           0             —          0
Tyiretyire    0           0             —          0
Khwe          1           1             —          0
ǁAni          1           —             —          0
Buga          —           —             —          —
Naro          0           0             1          —
Gǀui          1           —             —          1
Nama-Damara   1           —             1          —
!Ora          1           1             —          —
Haiǁom        1           —             —          —


~ ta:-ʔa (1SG-ACC) ‘me’. In Khoekhoe the corresponding marker is a suffix, as shown by
the Nama-Damara example (2).⁵


(1) Khwe (West Kalahari Khoe; Kilian-Hatz 2013: 374)
matiaci-m a (og d H — xaró-ná-tá
Matthew-3SG.M ACC money ACC 1SG give-J-PST

‘I gave money to Matthew.’


⁵Haacke's (2013b) construal of the morpheme -à as an oblique marker seems preferable to Hagman's (1973)
construal as a subordinate case marker, and I adopt it in this paper.

(2) Nama-Damara (Khoekhoe; Hagman 1973)
Táo-p ke tará-s-à péré-p-a kè máa
man-3SG.M DECL woman-3SG.F-OBL bread-3SG.M-OBL PST give

‘The man gave the woman bread.’


As example (3a) shows, in Shua the erstwhile beneficiary (an indirect object) in an
applicative construction is marked by the Acc (ʔ)à; this marking may be retained under
passivisation, as shown by (3b).


(3) Shua (East Kalahari Khoe; own fieldnotes)
a. taa a pü tyana-ma
   1SG ACC milk bring-APPL
   ‘Bring me some milk.’
b. tse: ʔa aka kohu nglo:-a-ma-e-ha
   1PL.C ACC PST meat cook-J-APPL-PASS-PST
   ‘The meat was cooked for us.’


Second, in both Khoekhoe and West Kalahari Khoe (-)(ʔ)à is widely attested as a relational
copula, that is, as a copula in attributing and/or identifying clauses. Example (4)
illustrates this usage in the West Kalahari Khoe language ǁAni.⁶ For Nama-Damara Hagman
(1973: 114-116, 164) identifies a present tense copula ʔa that is used in attributing
clauses, as shown by example (5a). He also identifies a suffix -à that is used as a marker
of the "predicate" in identifying clauses (Hagman 1973: 110), as in example (5b); this also
appears to exemplify a copula function (see also Voßen 1997: 174 and Haacke 2013b: 342
for brief remarks on !Ora).


(4) ǁAni (West Kalahari Khoe; Heine 1999: 24)
kx'oxu tshaa-kx'oxu-dzi ʔa
animal water-animal-3PL.F COP

‘Water animals are edible [are meat].’


(5) Nama-Damara (Khoekhoe; Hagman 1973: 116)
a. saá-ts ke ʔa kai
   2-2SG.M DECL COP big
   ‘You are big.’
b. saá-ts ke kai-ts-a
   2-2SG.M DECL big-2SG.M-COP
   ‘You are the big one.’


Copula usage may also be available in Gǀui. Nakagawa (2013: 400) speaks of a linking
use of -à that incorporates a body part nominal into a complex adjective, as in (6).


⁶Heine (1999) equivocates on the status of this marker as a morpheme distinct from the Acc marker.

(6) Gǀui (West Kalahari Khoe; Ono 2011)
fabi oi ja t4ó
3SG.M good COP heart

‘He is good in the heart’, i.e. ‘He is happy.’


An alternative and more plausible analysis is that ja (an allomorph of à) does not link the
adjective ‘good’ with the nominal ‘heart’ into a complex adjective, but rather functions
as a clausal copula in an external possession construction of the double subject type.
That is to say, in (6) goodness is attributed to the person, and the following body part
nominal indicates a restriction of the attribute to the person's heart - they are good in,
or with respect to, the heart.

In addition to its use as a relational copula, in Khwe and !Ora (ʔ)à can be used as a
presentative or existential copula (Tom Güldemann p.c.; Kilian-Hatz 2008: 52), as shown
by examples (7a) and (7b).


(7) Khwe (West Kalahari Khoe; Kilian-Hatz 2013: 251)
a. thíyà goava a
   many Mbukushu COP
   ‘There are many Mbukushu.’
b. om! tea lei-coava a!
   come.near 2SG.M skin-be.rotten COP
   ‘Come near! Here is your rotten skin [i.e. your food]!’


The copula function is not attested in any East Kalahari Khoe language to the best of 
my knowledge. 

Other uses of (-)(ʔ)à are rather sporadically distributed across the Khoe languages, at
least given the existing evidence. I briefly overview these additional senses.

In addition to marking direct and indirect objects, in Khwe, ǁAni, Nama-Damara and
!Ora (-)(ʔ)à occurs on subject NPs as well, albeit rarely. In the latter two languages -à
occurs on what is referred to as a “deposed” subject, that is, a subject that does not occur
in its usual clause-initial position, as in example (8). Unfortunately, the descriptions do
not make entirely clear either the formal properties of this construction or its meaning
and uses (Hagman 1973: 203; Haacke 2013b: 341, 2013a: 328).


(8) Nama-Damara (Khoekhoe; Haacke 2013a: 328)
tsi-b ge axa-b-a lóa-s-a tsaurase go tai
and-3SG.M IND boy-3SG.M-OBL girl-3SG.F-OBL gently Ser call

‘And then he, the boy, gently called the girl.’


⁷Strictly speaking, this is of course not a copular function in that it does not connect linguistic forms; indeed,
it better resembles the index there of the English presentative/existential than the copula be. However, I 
follow usual convention and use the term copula loosely in this fashion; it is not unreasonable in the sense 
that what is linked is the addressee's attention and the referent item. 
In Nama-Damara -à also regularly occurs on subjects of some clauses in marked
moods, the interrogative and imperative/hortative (Hagman 1973: 260, 270-271).

In Khwe subjects of both relational and verbal clauses can be followed by (ʔ)à, though
only when indefinite; for subjects of transitive clauses this marking is extremely rare
(Kilian-Hatz 2008: 51-52, 2013: 369-371), and it is slightly more common for intransitive subjects.
Kilian-Hatz (2013: 370) considers that in these contexts (ʔ)à serves as a focus marker
rather than as a subject marker, as in (9).


(9) Khwe (West Kalahari Khoe; Kilian-Hatz 2013: 370)
kúcugucugu à lgevúu-a-te
eagle FOC fly-J-PRS

‘An eagle is flying.’


More generally, Kilian-Hatz considers that the primary NP marking function of (ʔ)à
in Khwe is to mark focus, regardless of what other grammatical role is simultaneously
borne by the NP, whether it be a core grammatical relation or the locative - see especially
Kilian-Hatz (2008: 54, 2013: 370, 377). A secondary function is to mark the object;
this happens only (in Kilian-Hatz's view) in those circumstances in which the object
is obligatorily marked by (ʔ)à, namely on proper noun objects and on indirect objects
other than specific nouns (see further §3.3 below). Consistent with this, only one NP in a
clause is normally marked by (ʔ)à. In no other Khoe language has it been suggested that
(-)(ʔ)à is a general focus marker. By contrast, Heine (1999: 31, 67, 68) suggests that (ʔ)à in
ǁAni serves as a topic marker, perhaps primarily. He does not, however, explain what he
means by "topic" and the examples given could as well be interpreted as invoking focus
on the marked NP.

In both Shua and Ts'ixa (ʔ)à marks a general locative case, as shown in (10). However,
in both languages this form represents a different postposition to the Acc marker: it
shows different allomorphy and occurs with a different case form of the PGN markers
(Fehn 2014: 202; McGregor 2015).


(10) Ts'ixa (East Kalahari Khoe; Fehn 2014: 202)
kolóí-sí I?aé=m ʔa téé
car=SG.F:I village=SG.M:I LOC be.standing

‘The car stands in the village.’


In Nama-Damara -à occurs on NPs indicating time units, marking temporal duration
(Hagman 1973: 112, 199).

In Khwe there is a genitive case suffix -à that is used in the expression of part-whole relations
within NPs when the whole (modifying) nominal is indefinite (Kilian-Hatz 2008:
77). Examples are gà-á (00 (sheep-GEN hair) ‘sheep’s wool’, xúni-a khóó (crocodile-GEN
skin) ‘crocodile’s skin’, and hémpe-a pi (shirt-GEN pocket) ‘pocket of a shirt’. This is
cognate with the postposition (ʔ)à according to Kilian-Hatz (2008: 55).
The features identified in Table 3 are mostly language specific, and most are poorly
exemplified and described in the sources. The one in the final column, ‘Governed by
a postposition’, is something of an exception, and is attested in both Khwe and Nama-Damara.
In Khwe, in PPs with postpositions other than à (i.e. with local postpositions),
pronouns and PGN-marked (i.e. definite) NPs take the postpositions directly, while
non-PGN-marked (i.e. indefinite or non-specific) NPs are marked by -à GEN (Kilian-Hatz 2008:
64), as in example (11).


(11) Khwe (West Kalahari Khoe; Kilian-Hatz 2008: 66)
til kóánáci ki tcá cà-á ki lóé-è-lòè nò cé
then because LOC 2SG.M water-GEN LOC lie-ATV-HAB CON 1PL.F
té-é-lòè kó dì xóm-d ki
stay-ATV-HAB dry POSS sand-GEN LOC

‘Since you are used to lying in the water, and we are used to staying in the dry
sand [it is not good to come with us].’


Similarly in Nama-Damara an NP marked by one of the three local postpositions "oá
ALL (optionally), xuú ABL, or "id PER selects the oblique suffix -à following its PGN marker
(Hagman 1973: 112, 192-193).


(12) Nama-Damara (Khoekhoe; Hagman 1973: 192-193)
farip ke fom-s-a xuu ke peé
dog DECL house-3SG.F-OBL ABL RPST go:away

‘The dog went away from the house.’


In both Khwe and Gǀui (ʔ)à can occur on an NP in apposition with the object of a clause,
as in (13). The examples in Khwe all involve an attributive relation between the second
NP and the first; by contrast, in Gǀui they involve identification. These restrictions may
simply be an artifact of the small number of tokens given in the sources. None of
the sources mentions a restriction on the grammatical role of the NP that is attributed on or
identified, although in all of the illustrative examples that NP is the object. I suspect that
this usage is more widespread in Khoe languages.


(13) Gǀui (West Kalahari Khoe; Ono 2011: 2)
da ci nloori-xa-na lao nje=na (ʔa)
1SG.IRR 1SG.GEN grand:junior-with=PL.C.ACC insult DEM=PL.C.ACC ACC

‘Let me insult my grandchildren, these ones.’


In Gǀui, according to Ono (2011), as shown in Table 3, (ʔ)à occurs on dislocated NPs,
by which she apparently means NPs set off on their own intonation contour and either
preceding or following the remainder of the clause. The free translation for example (14)
suggests that these are a type of cleft construction. This is the only example given, and
it is not known whether the dislocated NP can bear any role other than object.

(14) Gǀui (West Kalahari Khoe; Ono 2011: 2)
fa ja k|oá-ki-sa (ʔa) tsa glae-si aaku
DEM CONJ child-FOC-SG.F.ACC IDTF 2SG.M.GEN woman-SG.F.NOM come
faba-g-ntoé
strap-J-sit

‘It is the child who your wife is strapping to her back.’


Ono (2011: 2) also says that (ʔ)à can be used to mark dislocated clauses. However,
just one example is given, and in this example (ʔ)à might be interpreted as marking a
complement clause.

There are a few attestations of (-)(ʔ)à as a clausal connector. In Naro a can be used
to connect a subordinate clause to a preceding main clause, according to Visser (2001: 1,
2010: 180). In Nama-Damara -à can mark an indirect speech report, as in example (15).
According to Hagman (1973: 256), the subordinator -à is attached to the indirectly quoted
clause plus complementiser /hai, which forms a single NP syntagm; the analysis provided
in Haacke (2013b: 345), although inexplicit, is consistent with this parsing. Note that
the -à is usually attached to an instance of /hai-s (that-3SG.F) in final position in the
complement clause; occasionally, however, the connector /hai is omitted and the PGN
marker is directly connected to the final word of the indirect quote. (It is possible that
this function is also served by (ʔ)à in Gǀui.)


(15) Nama-Damara (Khoekhoe; Haacke 2013b: 345)
ots kara mú-ba-sen lgam-he khom ra thai-s-a
2SG POT.PROG see-APPL-REFL kill-PASS 1DU PRS.PROG that-3SG.F-COMP

‘Then you may see for yourself that we are killed.’


A use not specifically indicated in the tables above is found in Khwe alone. This use is
in possessive NPs, where (ʔ)à marks an indefinite possessum - i.e. one that is not marked
by a PGN marker (Kilian-Hatz 2008: 70, 73). Kilian-Hatz treats this as an instance of the
copular function of (ʔ)à. An example is given in (16).


(16) Khwe (West Kalahari Khoe; Kilian-Hatz 2013: 70)
tá-khó-m di nigóá à
old-Ac-3SG.M POSS walking.stick COP

‘the old man's walking stick’


Finally, it should be remarked that in Khoekhoe, with one exception, the morphemes
discussed above are suffixes that invariably follow a PGN marker (Hagman 1973: 33-34;
Haacke 2013b: 341). Probable cognates of these suffixes are found in the final à vowel
in one series of PGN markers in most Kalahari Khoe languages, including Shua (McGregor
2014: 49), Tyiretyire (my fieldnotes), Ts'ixa (Fehn 2014: 62-64), Gǀui (Nakagawa
1993 cited in Fehn 2014: 315) and possibly in ǁAni (Heine 1999: 26-28) and Eastern ǁAni
(Fehn 2014: 315).⁸ The à-series of PGN markers serves a different range of functions in


⁸The situation in Khwe seems to be somewhat different, and Kilian-Hatz (2008: 40-41) does not distinguish a
distinct PGN series in à; she treats the different forms of the PGN markers in the third person as allomorphs.
each of the languages, but in all languages it is this series that is used on object NPs
(Kilian-Hatz 2008: 40-41; McGregor 2014: 49; Fehn 2014: 228, 315). It seems likely that
the suffix -à and final à vowel are both cognate with the free form (ʔ)à of Kalahari Khoe
languages, the latter having been added via the diachronic process sometimes referred to
as "doubling" or "reinforcement". The fact that similar environments of use and patterns
of optionality are found in Kalahari Khoe and Khoekhoe languages lends some support
to this hypothesis.


3 Optional accusative marking in Khoe languages 


As has been shown, in all Khoe languages for which there is sufficient data (-)(ʔ)à can
mark both direct and indirect objects; Haiǁom is the only language where this use is not
mentioned or exemplified in a basic syntactic description (Widlok 2013b).⁹ In Kalahari
Khoe languages it is a phrase-level marker that occurs in NP final position, normally as 
a separate word or clitic, though sometimes fused with the final word. In Khoekhoe it 
appears to be an inflectional suffix. 

In almost all Khoe languages (-)(ʔ)à is optional as a direct object marker in the sense
of McGregor (2010: 1610-1613, 2013: 1152).¹⁰ First, it may be present or absent on a direct
object NP without affecting the grammatical role borne by that phrase. There is no reason
to believe that the NP serves a different grammatical role when (-)(ʔ)à is present/absent,
and that in one instance it is not an object; nor (as far as I am aware) has any investigator
suggested that it has. Second, the presence or absence of (-)(ʔ)à is not predictable from
grammatical characteristics of the clause in which it occurs. Both conditions appear to
obtain in all East and West Kalahari Khoe languages, and in Khoekhoe at least in !Ora
(Haacke 2013b: 341). Nama-Damara is a probable exception. According to Haacke (2013b:
341) -à is consistently used on object NPs, in contrast with !Ora. Hagman (1973) makes
no reference to the optionality of this marker, and, given that he discusses optionality of
a range of other morphemes, his description implicates that it is obligatorily used. One
context in which -à does not occur on object NPs in Nama-Damara is in relative clauses,
where the object NP occurs in final position and effectively serves as a relative clause
relator (Hagman 1973: 230-231). Being a grammatically conditioned absence, this does
not count as an instance of optionality.¹¹ However, in certain other environments the
marker is perhaps optional, including when preceding the allative postposition and on
indirect speech complement clauses (see discussion of example (15) above).


⁹It seems likely that the uses of -à in Haiǁom are comparable with those of the cognate morpheme in Nama-Damara
and !Ora. Widlok (2013a: 158) indicates that there is an oblique suffix -a that attaches to the PGN
marker of an NP. Although its usage is not discussed in Widlok (2013a), it presumably marks objects (both
direct and indirect) and subjects as in the other two Khoekhoe languages.

¹⁰This phenomenon has also been referred to as “differential object marking” (DOM). I have suggested, however,
that this term as generally used covers a disparate range of phenomena which need to be distinguished
(e.g. McGregor 2010: 1613). In particular, the situation in which a single morpheme may be present or ab- 
sent on an object NP must be distinguished from the situation in which an object NP can be marked by 
two different morphemes. 

¹¹It does however fall within the range of phenomena commonly dubbed DOM (see previous footnote), what
the editors refer to in the introductory chapter as “clause-type-based differential marking”.
The situation for the marking of indirect objects in Khoe languages seems rather different,
at least in those languages for which information is available. On indirect object
NPs - which are prototypically human - (-)(ʔ)à usually appears. This is the case in Khwe
(Kilian-Hatz 2008: 51, 56, 63), as in example (17), Shua (my own fieldnotes), and Ts'ixa
(where the majority of indirect objects in the examples cited in Fehn 2014 are marked either
by the Acc or the DAT postposition; few are unmarked). In the remainder of this section I focus on the
marking of direct objects, excluding indirect objects from the exposition.


(17) Khwe (West Kalahari Khoe; Kilian-Hatz 2008: 63)
matiaci-m a (od d H xaró-á-ta
Matthew-3SG.M ACC money ACC 1SG give-J-NPST

‘I gave money to Matthew.’


Existing accounts say little about the motivations for use vs. non-use of (-)(ʔ)à on direct
object NPs. Indeed, a number are silent on the issue, as in the brief treatments of Danisi, 
Deti, Cara and Kua morphology and syntax in Vossen (2013b). One has to examine the 
examples given in the papers to discover that the marker is not always present on object 
NPs. 

I would argue that the usage-based theory of optional case marking elaborated in e.g. 
McGregor (2010; 2013) accounts for the optional accusative in Khoe languages. In what 
follows I provide a brief overview of the main features of this theory; see McGregor 
(2006; 2010; 2013) for more detailed discussion. 

Two fundamental assumptions of the theory are: (a) within particular constructions 
case-markers index specific grammatical relations; and (b) use and/or non-use
of an optional case-marker can potentially encode a meaning (again within the spec- 
ified construction). The first assumption would seem to be uncontroversial, and is as- 
sumed by most grammarians: for instance, in a transitive construction (with only the 
inherent grammatical roles), a case-marker such as the accusative will mark a particular 
role, namely the object. The second assumption is perhaps more controversial. It applies 
specifically to optional case-markers and asserts that a meaning may be coded by using 
and/or not using the case-marker in the environment of its optionality. These assump- 
tions imply that there are two possible loci of meaning: the case-marking morphemes 
themselves and their usage or non-usage. 

A case-marker indexes the grammatical role(s) that it marks; this is its meaning. As 
a consequence, it indirectly and symbolically conveys the meaning associated with that 
grammatical role - where I presume, along with various functionally oriented theories, 
that grammatical categories, including roles such as subject and object, are meaningful 
(e.g. Haas 1954; Halliday 1985: 30-32; Langacker 1987: 275, 316, 1991: 289; Shaumyan 1987: 
27; McGregor 1997: 2). 

The meanings associated with use or non-use of an optional case-marker do not, by 
definition, concern grammatical relations; rather, they relate to the domain of joint at- 
tention, to the integration of information into the joint-attentional frame (Tomasello & 
Farrar 1986; Tomasello 2003). My proposal is that use of an optional case-marker can 
serve to accord particular attention to the marked grammatical role or its filler, singling
it out as the centre of attention - in other words, highlighting it (McGregor 2013: 1157).
By contrast, non-use of an optional case-marker may serve a backgrounding function, 
shifting the role or its filler outside of the domain of the joint-attentional frame, assign- 
ing it to the domain of what is presumed by the speech interactants, to the common 
ground at that point in the speech interaction. The point of the modal qualifications of 
the previous two sentences is that if a meaning is conveyed - i.e. coded - by use or 
non-use of a marker it will be of the type specified; it is also possible that no meaning is 
conveyed by either or both. 

It is convenient for descriptive and comparative purposes to assign feature labels to 
the two possible meanings, [prominent] and [backgrounded], and to allow them to take 
values + (specifying that the feature is coded and thus marked), and - (the unmarked 
value of the feature, where it is not coded and no meaning of the specified type is con- 
veyed). There are thus four coding possibilities for use or non-use of an optional case 
marker, as shown in Table 5. 


Table 5: Meanings potentially coded by presence and absence of an optional marker


Use      No meaning        Meaning           No meaning        Meaning
         [-prominent]      [+prominent]      [-prominent]      [+prominent]
Non-use  No meaning        No meaning        Meaning           Meaning
         [-backgrounded]   [-backgrounded]   [+backgrounded]   [+backgrounded]
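
For readers who find it useful to see the feature system spelled out mechanically, the four columns of Table 5 can be generated from the two binary features. The following is only an illustrative sketch: the function name and the dictionary keys are assumptions made for this example and are not part of the theory or of any published implementation.

```python
from itertools import product


def coding_possibilities():
    """Enumerate the four combinations of [±prominent] and [±backgrounded].

    Hypothetical helper: 'use' carries the [prominent] feature and
    'non_use' carries the [backgrounded] feature, as in Table 5.
    """
    rows = []
    # Iterate backgrounded first so that the order matches the columns of Table 5.
    for backgrounded, prominent in product((False, True), repeat=2):
        rows.append({
            "use": "[+prominent]" if prominent else "[-prominent]",
            "non_use": "[+backgrounded]" if backgrounded else "[-backgrounded]",
        })
    return rows


if __name__ == "__main__":
    for row in coding_possibilities():
        print(row["use"], row["non_use"])
```

A "+" value corresponds to a coded meaning and a "-" value to no meaning of that type being conveyed.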


The two features [prominent] and [backgrounded] correlate with expectedness in the 
grammatical role: prominence is naturally assigned to something that is unexpected, 
while it is natural for something completely expected to be backgrounded. These fea- 
tures are intended to capture the commonality in the cross-linguistic diversity in the 
actual meanings associated with usage and/or non-usage. They contextualise in a range 
of different ways in different languages and constructions, depending in part on how 
the notion of expectedness is construed, on what sense of (un)expectedness is associ- 
ated with prominence and/or backgrounding. For instance, among other possibilities, it 
may concern the prototypical likelihood of the referent of a particular type in that role; 
it may concern the likelihood of the particular referent in the role in the specific token; 
it may concern the identity of the filler of the object role. 

For just three Khoe languages is some discussion of motivations for optional accusative 
case-marking available: Shua (McGregor 2015), Ts'ixa (Fehn 2014) and Khwe (Kilian-Hatz 
2008; 2013). These are overviewed in the following three subsections, respectively. For 
the other languages little can be said given the absence of discussion in the sources and 
the paucity of examples - though the ǁAni texts in Heine (1999) are probably quantitatively
sufficient to warrant examination.
3.1 Shua (East Kalahari Khoe)


In Nata Shua the frequency of use of the Acc marker (ʔ)à differs according to the position
of the object NP on an animacy scale (McGregor 2015): on personal pronouns (i.e.
pronouns other than the 3SG.C ‘it’) and personal names the Acc is (almost) obligatory;
on PGN-marked lexical NPs it is quite common, though not obligatory; on ordinary lex- 
ical human NPs unmarked by a PGN marker it is relatively infrequent; on lower-order 
animate NPs and inanimates it is rare; on mass inanimates the Acc marker is not at- 
tested. Frequency of use of the optional marker in specified environments is indicative 
of semantic markedness in the respective contexts, as per Levinson's I and M heuristics 
(Levinson 2000) and the observation that the two types of markedness often correlate. 
McGregor (2015) proposes that in Shua use and/or non-use of the Acc serves to either 
make the direct object prominent or to background it, depending on the animacy of the 
direct object NP, as shown in Table 6, where grey background indicates uncertainty due 
to paucity of examples. 


Table 6: Meanings of presence vs. absence of Acc on different NP types in Shua


NP type                        Acc marker present       Acc marker absent
Personal pronouns              No meaning               Direct object backgrounded
Personal names                 No meaning               Direct object backgrounded
PGN-marked NP                  Direct object prominent  Direct object backgrounded
                               or no meaning
Other human NP                 Direct object prominent  No meaning
Non-human animate & inanimate  Direct object prominent  No meaning


For personal pronouns and names Acc marking is (almost) always present, and thus is 
unlikely to convey a meaning; by contrast the absence of the Acc marker (if permissible) 
can be expected to background the object. For PGN-marked NPs the Acc marker is also 
very frequent and can be expected to convey no meaning; however, if it is absent the 
direct object is backgrounded. For human NPs of other types the Acc marker is normally 
absent, and this most likely conveys no meaning; its presence, by contrast, is rather rare, 
and marks the direct object as prominent. Similarly for non-human animates and inani- 
mates the presence of the Acc marker assigns prominence to the direct object, whilst its 
absence, the usual situation, conveys no specific meaning - the direct object is neither 
made prominent nor is it backgrounded. 
The present paper is not the place to present arguments for these claims, which are 
discussed in McGregor (2015). I illustrate here just the claim for NPs of the lowest ani- 
macy. For these NPs McGregor (2015) shows that two types of consideration are relevant 
to the choice of making a direct object prominent. One set of considerations concerns 
identity, in particular whether or not the referent is expected as the filler of the direct 
object role in the particular discourse circumstances. This is illustrated by example (18a), 
which comes from a description of a drawing in the wordless picture book A boy, a dog, 
and a frog (Mayer 1967) in which the boy has netted his dog in a fishing net. The preced- 
ing drawings construct a story in which it is expected that the boy will net the frog; the 
identity of the object referent is thus unexpected. Usually in Shua, as in this example, 
prominence is assigned by use of the acc marker when the direct object referent is se- 
lected from an already established set of referents, from an established space of potential 
referents. Less often, it is assigned when the direct object referent contrasts with another 
potential filler of the role, as in (18b). 


(18) Shua (East Kalahari Khoe; own fieldnotes) 


a. aba: fa ema lam-rekareka 
   dog ACC 3SG.M hit-maybe 
   ‘Maybe he is hitting the dog’ 

b. ta: aka ke lori fa mů: ta: aka sekuskara ?a  mú:-ta 
   1SG PST IPFV truck ACC see 1SG PST donkey.cart ACC see-NEG 
   ‘I saw the truck, not the donkey cart.’ 


The other set of considerations concerns the degree of patientivity of the direct object 
referent. Consider (19), which describes an event in which a tent is completely destroyed 
by hail; it is affected to a higher degree than might be expected - it might be expected 
that a tent is knocked down, though not necessarily completely wrecked. In this instance 
the unexpectedly high degree of effect on the tent motivates making the direct object 
prominent by marking it with the Acc postposition (identity considerations are irrele- 
vant in this instance.) Other descriptions of similar events in which the tent is not so 
heavily affected by the event (not torn to shreds), or in which it is just rain that did the 
job, did not employ Acc marking on the object NP. 


(19) Shua (East Kalahari Khoe; own fieldnotes) 
hé:xo: fa tu:-a-ta tu: kom ka tante Zo bo:ru-hu-a-ha 
this LOC rain-J-PST rain hail INS tent ACC hole-CAUS-J-PST 
‘The rain that rained here with hail tore the tent to shreds’ 


Note that it is not suggested that the absence of the acc marker indicates a lower degree of effect on the 
direct object, only that it is consistent with lower affectedness of that entity. Absence of the marker on 
inanimate NPs, as indicated in Table 6, conveys no meaning. 
3.2 Ts'ixa (East Kalahari Khoe) 


Fehn (2014: 231) sums up the meanings associated with use of the Acc marker (ʔ)à in 
Ts'ixa as shown in Table 7 (slightly modified), where the definiteness of a lexical NP is 
dependent on the presence of a PGN marker. In contrast with the situation for Shua, in 
Ts'ixa word order is relevant to the choice of using or not using the Acc marker. 


Table 7: Meaning of Acc marking on definite and indefinite NPs in Ts'ixa 


Word order   Definite NP: personal pronoun   Indefinite lexical NP
             or PGN-marked lexical NP        (not PGN-marked)
S O ʔà V     No meaning (Acc obligatory)     Contrastive focus
S V O ʔà     No meaning (Acc obligatory)     No meaning (Acc precluded)
O ʔà S V     Contrastive focus               Contrastive focus


It will be observed that no information is provided in Table 7 about meanings associ- 
ated with non-use of the Acc marker in Ts'ixa. However, it is likely that non-use never 
conveys meaning. For indefinite direct objects the Acc marker occurs rarely (Fehn 2014: 
229); its absence is the norm, and may reasonably be presumed to convey no meaning. 
For OSV clauses the fact that accusative marking on both definite and indefinite direct 
objects has the same meaning leads one to expect that its omission is similarly motivated 
in each case, and thus has no meaning. 

It is worth observing in passing that the facts of optional accusative case marking 
in Ts'ixa are not well accounted for by the disambiguation theory, according to which 
the case-marker is used when there is a possibility of confusion between which roles 
are borne by the NPs of a transitive clause, and not used when there is no likelihood of 
confusion. First, the most common word orders in Ts'ixa are SOV and SVO, which are 
about equally frequent (Fehn 2014: 214). In these unmarked word orders definite object 
NPs are obligatorily marked by the Acc, whilst in the more marked OSV word order 
the marking of the direct object is optional. Second, the Acc occurs either obligatorily 
or optionally (in OSV clauses) where it is not required to disambiguate the fillers of 
the subject and object roles: on PGN-marked direct objects, where the form of the PGN 
marker indicates the grammatical role of the NP. By contrast, for direct objects that 
are not PGN-marked - and thus where the NPs denoting them are not morphologically 
distinct from non-PGN-marked subject NPs - the Acc marker is optional or precluded. 

Fehn (2014) suggests that in all circumstances where the Acc is optional its presence 
assigns contrastive focus to the object. This is illustrated in the following exchange, 
invented by a native speaker to illustrate the meaning difference between the presence 
and absence of the Acc marker. (20d) in particular illustrates contrastive focus on the 
object. 
(20) Ts'ixa (East Kalahari Khoe; Fehn 2014: 230) 

a. maá "9 tsá l'árh-nà-tà 
   who ACC 2SG.M beat-J-SDPST 
   ‘Who did you beat?’ 

b. k’aro=ma tí |— l'árh-nà-tà 
   boy-3SG.M.I 1SG beat-J-SDPST 
   ‘I beat the boy.’ 

c. faba=ma tsá l'ára-nà-tà 
   dog=3SG.M.II 2SG.M beat-J-SDPST 
   ‘Did you beat the dog?’ 

d. dt ?abá-mà tí l'ára-nà-tà fíté | k'aro-mà fà tí 
   no dog=3SG.M.II 1SG beat-J-SDPST NEG boy-3SG.M.I ACC 1SG 
   l'árh-nà-tà 
   beat-J-SDPST 
   ‘No, I did not beat the dog, I beat the boy.’ 


However, a number of other examples provided in Fehn (2014) appear not to exemplify 
contrastive focus, but rather, as in Shua, the presence of the Acc functions to select a 
referent (type) from a set of presumed entities or entity types. This is illustrated by (21b) 
- compare this example with the neutral (21a), which does not involve an instance of the 
ACC postposition. 


(21) Ts'ixa (East Kalahari Khoe; Fehn 2014: 229) 


a. xam=ma tél — l'üü-á-tá 
   lion=3SG.M.I 3PL.M kill-J-SDPST 
   ‘They killed the lion’ 

b. xarh=ma ta é —lüü-á-tá 
   lion=3SG.M.I ACC 3PL.M kill-J-SDPST 
   ‘They killed the lion (and not something else).’ 


Indeed, there are other environments in which use of the Acc does not assign con- 
trastive focus to a direct object. One such situation is when the object is human and 
indefinite (Fehn 2014: 232), as in example (22). Here again it is possible that prominence 
is assigned to the object by use of the Acc in view of selecting the relevant entities from 
the class of available ones: a possible interpretation of this example is that it narrows 
down to Khoe people as possible speakers, from the set of all persons who might speak 
at Gloxa-Hill (there may well be other possible explanations for this that are consistent with 
assigning prominence to the object NP ‘people’, but more detailed knowledge of the 
discourse environment would be required to permit evaluation). 
(22) Ts'ixa (East Kalahari Khoe; Fehn 2014: 232) 
glóóxà-m ^ ngüà  lú.xud tsá kó khoe fà kom kuí 
GN=3SG.M.I LOC sometimes 2SG.M IPFV person ACC hear speak 
kó-sé 
IPFV-ADV 
‘At Gloxa-Hill, you can sometimes hear people speak.’ 


To sum up, the facts as outlined above for Ts'ixa are not inconsistent with my propos- 
als for the motivations for using and/or not using the Acc postposition. In particular, use 
of the Acc in the environments in which it is optional assigns a high degree of promi- 
nence to the object. What is significantly different from Shua is that only identity of 
the object referent seems to be a relevant consideration in making the object prominent; 
considerations of the degree of patientivity of the object appear not to be pertinent in 
Ts'ixa (cf. example (19) above). To make the case watertight requires more evidence, in 
particular, more information on frequencies - especially on the frequency of marking 
vs. non-marking of definite object NPs in clause-initial position, and on the effects of 
animacy on frequencies. 


3.3 Khwe (West Kalahari Khoe) 


The situation in Khwe is somewhat murky, despite the extensive treatment in Kilian- 
Hatz (2008; 2013). The basic facts concerning the distribution of the marker on NPs in 
core clausal roles appear to be as shown in Table 8, which excerpts relevant information 
from Table 12 of Kilian-Hatz (2008: 47) and Table 15 of Kilian-Hatz (2008: 56). 


Table 8: Accusative marking of core NPs in Khwe 


Definite NP Indefinite NP 
Pronoun Proper noun Specific Generic Unspecific 
PGN (ʔ)à PGN (ʔ)à PGN (ʔ)à PGN (ʔ)à PGN (ʔ)à 
Intransitive clauses 
S n 8 E = de = + Rs 


Transitive and ditransitive clauses 


S + - - - + - (+) (+) 
O + Ł + + + +(-) + + 
IO + + + + + +(-) - + - + 


+ ‘with marker’; - ‘without marker’; ± ‘optional marker’; () ‘rare’. 


As already mentioned, Kilian-Hatz considers (ʔ)à to be an object marker only in those 
contexts in which it is obligatory on NPs in that role, i.e. proper noun objects and 
indirect objects other than specific nouns; elsewhere she takes it to be a focal marker. 
About two thirds of direct object NPs overall are (ʔ)à-marked (Kilian-Hatz 2013: 376). 
Kilian-Hatz suggests the following generalisations for the marking of direct objects in 
those circumstances in which it is not obligatory (Kilian-Hatz 2008: 59-61, 2013: 371- 
372): 

Specific/definite objects 

* Use of (ʔ)à is motivated by: 
  - Possibility of confusion as to who is acting on who 
  - Contrastive contexts, including selective contexts 

* Non-use of (ʔ)à is motivated by: 
  - Presence of another NP marked by (ʔ)à, e.g. on the subject or indirect object 
    (see Table 8) 
  - No possibility of confusion as to who is acting on who 

Non-specific/indefinite objects 

* Non-use of (ʔ)à is motivated by: 
  - No possibility of confusion as to who is acting on who 
  - Presence of another NP marked by (ʔ)à 


As indicated, for both specific/definite objects and non-specific/indefinite objects non-use 
of the marker (ʔ)à is motivated by the same considerations. Kilian-Hatz (2008) does 
not make it clear why this should be the case; in fact, it is doubtful for two reasons. 
First, for specific nouns absence of the marker is rare according to Table 8, suggesting 
that use is the norm (as stated specifically in Kilian-Hatz 2013: 372). Being so predom- 
inant, use is unlikely to be associated with specific contexts or meanings. Second, for 
non-specific/indefinite nouns presence and absence appear less skewed in distribution 
- although Kilian-Hatz (2013: 371) says that the marker is used in most cases — and one 
would expect both to be motivated and meaningful. Nonetheless, no motivation is sug- 
gested for the use of the marker on such objects. 

To begin, the non-use of (ʔ)à will be examined. Examples cited in Kilian-Hatz (2008; 
2013) reveal that the two circumstances of non-use cited in the above generalisations 
are at best statistically correlated with non-use. There are examples in which objects are 
marked by (ʔ)à alongside other NPs that are also marked by (ʔ)à; see examples (34) 
and (35) in Kilian-Hatz (2008: 52). According to Kilian-Hatz such examples tend to be 
found in elicitation rather than in actual discourse; this is, however, a tendency and not 
a rule. 

Similarly, the association of non-use of (ʔ)à with contexts in which there is no doubt 
as to who is acting on who is not consistently borne out: (23) is an example in which 
an object is marked by (ʔ)à and there is clearly no real possibility of confusion as to 
who/what is acting on who/what. 
(23) Khwe (West Kalahari Khoe; Kilian-Hatz 2013: 372) 
lóe-he átaxa  t'á-à-té ce-dji à 
child-3SG.F thus eat-ATV-PRS bush:species-3PL.F ACC 
‘Thus the child eats the ce fruits.’ 


Nor does the possibility of confusion as to who is acting on who consistently engender 
the presence of (ʔ)à on specific object NPs, as shown by (24a) and (24b). In both of these 
examples it would seem that there is a genuine possibility of confusion as to who is 
acting on who, though no instance of the marker (ʔ)à is present. 


(24) Khwe (West Kalahari Khoe; Kilian-Hatz 2013: 372) 
a. ó-l'áü-khoé-té té khóé-té xà-má kx'ó 
   PRV-hair-person-1PL.C 1PL.C person-1PL.C DEM-3SG.M eat:meat 
   ‘He has to eat us, the ones without fur.’ 

b. tan tii tcá tí ú 
   stand:up then 2SG.M 1SG bring 
   ‘Stand up, then I may take you.’ 


Just a few examples cited in Kilian-Hatz (2008; 2013) - one of which is (25) - illustrate 
the contrastive function of (ʔ)à on direct object NPs. In (25) the rock monitor and genet 
have already been mentioned and are represented by definite NPs (marked by a PGN 
marker). However, only the former is marked by (ʔ)à: Kilian-Hatz (2013: 372) comments 
that the genet is the overall discourse topic of the narrative, presumably accounting for 
the absence of the marker (ʔ)à on this NP in the second clause. 


(25) Khwe (West Kalahari Khoe; Kilian-Hatz 2008: 372) 
tínü  córo-md-d Igàa-khoe-dji nlgóá-à-te 
then rock:monitor-3SG.M-ACC female-person-3PL.F cook-ATV-PRS 
kx "4-khoe-le tcámba-ma  nIgóá-a-té 
male-person-1PL.M genet-3SG.M cook-ATV-PRS 
‘Then the women are cooking the rock monitor, and we men are cooking the 
genet.’ 


A better explanation of the situation in Khwe is possible within McGregor’s (2010; 
2013) theory of optional case marking. First, as mentioned above the Acc marker is almost 
always present on specific direct objects, and is unlikely to convey meaning. In this 
context only non-use of the Acc marker conveys meaning; this must be to background 
the direct object. In examples such as (23) and (25), then, the presence of the (ʔ)à marker 
on the direct object NP does not serve to foreground it, or to disambiguate it from the 
subject, but rather indicates nothing particular: the object is simply an object. It is the 
absence of the marker on the direct object of the second clause in (25) that is meaningful, 
and serves to background it - consistent with the fact that it is the primary discourse 
topic, and thus a good candidate for something presumed, for something that is understood 
as a component of the common ground at that point in the interaction. Certainly 
contrasts are apparent in this example: between the women and men as subjects and the 
rock monitor and genet as direct objects. But it is not the use of (ʔ)à that signals them; 
it is presumably some other features of the utterance (assuming that the contrasts are 
actually marked rather than inferred). 

Second, for pronominal objects and indefinite objects, as suggested above, it seems 
that use and non-use of (ʔ)à are more evenly distributed. However, in the absence of statistical 
data and a comprehensive examination of examples in the sources it is impossible 
to be certain whether both are meaningful. Indeed, it may be that further distinctions 
need to be made according to e.g. inherent features of the NP (animacy) or word order. 

Third, if (as Kilian-Hatz 2013: 371 avers) the marker is used on most indefinite direct 
objects, it is likely that non-use on them is meaningful, and serves a backgrounding 
function. This is consistent with the absence of (ʔ)à on an object comprising a very long 
list of types of things and an object with an abstract type referent (respectively, examples 
(471a) and (471b), Kilian-Hatz 2013: 371-372). Use of the marker could also be meaningful, 
though this is less certain. 


3.4 Concluding remark 


In most Khoe languages accusative marking of direct object NPs is optional, at least in 
certain environments. Little is known for certain concerning the factors that motivate 
use vs. non-use of the accusative marker in its environments of optionality, or the mean- 
ings that are expressed. Nonetheless, it is likely that the theory of McGregor (2010; 2013) 
can account for the facts in the various languages. Admittedly, much more research on 
each Khoe language is necessary to make a convincing case. For present purposes, it is 
sufficient to observe that the theory suggests that presence and/or absence of the post- 
position concern joint attention and grounding. 


4 The emergence and development of optional accusative 
marking 


Before beginning the exposition of my proposals for the development of optional accusative 
marking in Khoe languages some cautions are in order. First, as has already 
been mentioned, descriptions of most Khoe languages lack comprehensiveness in their 
treatment of (-)(ʔ)à, including motivations for its use and non-use. Second, diachronic 
data of any significant depth is non-existent: there is no long tradition of writing in any 
Khoe language or of linguistic investigation going back very far in the past. Third, as 
Fehn (2014: 319-320) rightly observes, serious problems lie in the low distinctiveness of 
the form (-)(ʔ)à - and hence the probability of spurious cognates and look-alikes - to 
say nothing of the irregular presence of the initial /ʔ/ in the sources. These problems 
bedevil grammaticalisation investigations of “exotic” languages, for which descriptions 
are frequently partial, and historical depth is lacking in the data; moreover, grammemes 
are often phonologically reduced and/or show phonologically unmarked shapes. 
These concerns do not mean that one should avoid the domain of grammaticalisation. 
But they do imply that one needs to constrain these hypotheses. One way is to invoke 
attested pathways as far as possible, especially pathways that have empirical evidence in 
actual diachronic data. Another is to compensate for lack of time depth with synchronic 
diversity. Thus the need for information on the relevant forms and their functions in a 
diverse sample of languages in the family or in the geographical region. Even with these 
constraints, the proposals ultimately remain speculative, albeit hopefully plausible. 

Before outlining my proposals for the grammaticalisation of accusative marking in 
Khoe languages in §4.2 I briefly overview the few existing suggested scenarios. 


4.1 Overview of existing proposals 


The Khoe literature contains just a few suggestions for the grammaticalisation of the 
accusative (-)(ʔ)à. None of these explain how (-)(ʔ)à became an accusative marker, or why 
it is optional in Shua, Ts'ixa, Khwe and !Ora. It is useful to overview these proposals and 
draw out their inadequacies so that the proposals I outline in §4.2 can be tested against 
these weaknesses. 

The only detailed scenario for the grammaticalisation of (-)(ʔ)à is that proposed in 
Kilian-Hatz (2008: 55, 2013: 376-378) for Khwe.¹³ These two sources describe effectively 
the same diachronic scenario, albeit with some differences in detail and in the degree of 
elaboration of certain points. Both proposals are based squarely on the situation in Khwe, 
and although they might be extended to other Khoe languages, emendations would be 
necessary to account for the different endpoints in the various modern languages. 

The scheme in (26) shows the grammaticalisation scenario proposed in Kilian-Hatz 
(2008: 55), while (27) shows the version presented in Kilian-Hatz (2013: 376-377). According 
to (26), the development of the genitive function of (ʔ)à is independent of the 
development of the object marking functions, and this is left out of (27). This part of the 
story is of no concern here. 


¹³König (2008: 276-278) briefly outlines a scenario similar to Kilian-Hatz's, involving a change from copula 
to focus marker to accusative marker, which she says is applicable to both Khwe and Khoekhoe (see (26) 
and (27) below). On the other hand, Haacke (2013b: 342) suggests in a parenthetical aside that -à in !Ora is 
“derived from the stative aspect marker”. He provides no discussion or evidence for this suggestion, which 
is presumably based on formal identity or similarity of the oblique suffix with a stative aspect marker -a. 
(26) genitive (indefinite/unspecific) 
         ↑ 
     copula/presentative (indefinite/unspecific) 
         ↓ 
     subject focus (indefinite/unspecific) 
         ↓ 
     direct object focus 
         ↓ 
     indirect object focus 
         ↓ 
     adverbial focus 


(27) COP  copulative/presentative use of à, which is restricted to indefinite subjects in 
          verbless clauses 
     FOC  “becomes a focus marker that introduces new information”: indefinite 
          subjects, indefinite objects 
     O    extends to new but definite objects (a) to indicate contrastive focus; (b) to 
          disambiguate syntactic roles 
     OBL  extends to focus on local and temporal adverbials (apparently NPs) 


According to both (26) and (27) (ʔ)à was initially a copula/presentative marker that 
was restricted to verbless clauses with indefinite or unspecified subjects, in which case 
it followed the subject NP. Reflexes of this putative initial state are found in the modern 
language, as shown by (28)-(29) (see also (7a) and (7b) above). 


(28) Khwe (West Kalahari Khoe; Kilian-Hatz 2008: 135) 
     yl á léü 
     tree COP big 
     ‘The tree is big’ 


(29) Khwe (West Kalahari Khoe; Kilian-Hatz 2008: 99, 208) 
     nlii dáó à Fó dáó à 
     DEM path FOC small path COP 
     ‘This path is a small path’ 


¹⁴Kilian-Hatz usually glosses à in example (28) as FOC, not COP (she is inconsistent in example (29), glossing 
it as COP on p. 199, but as FOC on p. 208). What she normally glosses as COP is à in final position in the 
relational clause, as in the case of the second instance of à in (29). In keeping with the remarks of footnote 
2 above, I employ the gloss COP for those cases in which the marker appears to be serving as a copula. 
From marking indefinite/unspecified subjects of copula relational clauses (see (26)) 
(ʔ)à extended to marking indefinite/unspecified subjects of ordinary verbal clauses, then 
to marking direct objects, and ultimately to marking indirect objects. One difficulty with 
this scenario is that it predicts that the frequency of (ʔ)à should be highest on the oldest 
usage, subjects, and lowest on the newest, indirect objects. In fact, the newer uses of 
the focus marker are more frequent than the most established one. Indeed, as has been 
remarked, (ʔ)à occurs on almost all indirect object NPs, making it improbable that it 
assigns focus to indirect objects. 

(27) gets around this difficulty by presuming that in its first stage of development (ʔ)à 
became a general focus marker that was not specifically associated with any grammatical 
role, but marked NPs that presented new information. From this, it seems that the focal 
value of (ʔ)à began to change somewhat, so that on object NPs it came to be associated 
with contrastive focus and disambiguation. There are a couple of things that are not 
accounted for in this scenario. 

First, the change from focus marker to contrastive focus marker is a restriction and 
strengthening of the focal value of the marker, and seems an unlikely change for a focus 
marker - one expects a focus marker to weaken over time, not to strengthen. How this 
relates to the disambiguation function is not made clear, though in this instance it is plau- 
sible that the focal value has weakened. This scenario thus invokes both strengthening 
and weakening of the focal value of the marker. 

Second, it is not explained in this story why this change happened with direct object 
NPs but not with subject NPs. Indeed, neither (26) nor (27) accounts for the strong association 
of (ʔ)à with object NPs in Khwe (or any other Khoe language) - recall that 
about two thirds of object NPs in Khwe are marked by this postposition; NPs in other 
grammatical relations are far less frequently marked.¹⁶ 

Third, (27) presumes an initial association of (ʔ)à with indefinite NPs and new information, 
as expected for a focus marker. And indeed as can be seen from Table 8, this 
association is manifest in Khwe for subject NPs. However, for direct object NPs the situation 
is inconsistent with the initial association: definite proper and specific ones are 
almost always marked by (ʔ)à, whereas for indefinite ones (ʔ)à remains optional. It remains 
unexplained why the marking of indefinite direct object NPs did not become more 
entrenched and frequent than the marking of definite direct object NPs, given that the 
former represents the older and less marked situation. One expects under scenario (27) 
that on definite NPs occurrence of the postposition (ʔ)à would have been more restricted 
and infrequent. 

Finally, it is unclear why (in both (26) and (27)) it is only in the final stage that (ʔ)à 
comes to be used to mark spatial and temporal locatives. To be sure, in this instance the 
low frequency of use of the postposition is consistent with the late development of the 


¹⁵Note that in this example Kilian-Hatz (2008: 208) treats the subject as having an indefinite head noun (since 
it is not marked by a PGN marker), even though there is a demonstrative in the NP. 

¹⁶Kilian-Hatz (2013: 356-357) provides some relevant figures showing the strength of the association of (ʔ)à 
with the object role, based on a corpus of some 1,500 sentences from a set of 30 texts. In this small corpus 
29 object NPs are marked by (ʔ)à (i.e. almost 80%), 8 are not marked; no transitive subject NPs are marked. 
(No figures are given for intransitive subject NPs.) 
function. But it remains unclear why - if it had really begun as an unrestricted focal 
marker - (ʔ)à would not have been used generally on clausal units, regardless of their 
grammatical role. 

Kilian-Hatz (2013: 378) attempts to flesh out details of the development of (ʔ)à from 
a pragmatic marker of focus to a “more grammatical” marker of object, as per (30). She 
does not indicate, however, how precisely this sequence of steps fits with that proposed 
in (27). 


(30) focussed referent precedes main clause in copulative periphrasis with à, and 
     object referent identical with that of main clause - O à, SOV 
         ↓ 
     reinterpretation of copulative periphrasis as focussed and topicalised object of 
     main clause - O à SV 
         ↓ 
     reinterpretation of focus marker à as combined focus-object marker 
         ↓ 
     expansion of focus-object marker to objects in SOV and SVO clauses (SO à V, 
     SVO à) 


Again, a number of stages proposed here lack motivation. To begin with, why didn't the 
story begin at the second stage? An obvious motivation for this might be to mark an 
object that occurred in a marked order with respect to the subject. Second, given that 
at the same time (ʔ)à marked indefinite subjects, which also typically occurred preverbally, 
why should (ʔ)à have come to be strongly associated with object NPs? Third, this 
additional scenario still does not account for the most serious difficulty of (26) and (27), 
namely how and why (ʔ)à extended to definite direct objects, and why it is so frequent 
on them. 


4.2 An alternative proposal 


I begin by reconstructing in broad brush a set of probable diachronic changes leading to 
the core case-markers of modern Khoe languages; these will be discussed and elaborated 
further below. It should be noted that this is intended to capture only a small part of the 
possible diachronic developments involving (-)(ʔ)à; developments that do not pertain to 
the accusative marker as endpoint are largely ignored. For instance, it seems reasonable 
to presume that the locative postposition (ʔ)à of many Shua varieties developed from 
the same source as the accusative, albeit via independent processes. Also explicitly left 
out of the proposed diachronic scenario are those changes specific to (-)(ʔ)à on indirect 
objects. 

It is reasonable to presume that the marker (ʔ)à was initially a separate word. At some 
point in time it lost its freedom of occurrence when following the PGN marker, becoming 
a suffix to that marker. In the Khoekhoe lineage PGN markers became effectively 
obligatory on NPs, and the suffix was restricted to this environment. However, in this 
lineage the association with objects never became exclusive, and a small fraction of subject 
NPs retained the suffix - e.g. “deposed” subjects, and in Nama-Damara subjects in 
some marked illocutionary moods; moreover, in !Ora at least on object NPs the suffix 
never became entirely obligatory. 

In the Kalahari Khoe lineage different diachronic developments occurred. In contrast 
with Khoekhoe languages the PGN markers were used only on some NPs (perhaps definite 
ones as in many of the modern languages), and it was only in this environment that 
the (ʔ)à lost its freedom of occurrence. Ultimately the suffix was reinterpreted as a part 
of the PGN marker, leading to the development of two (sometimes three) series of PGN 
markers in the majority of languages. The à series was strongly associated with objects, 
and effectively became an accusative series in some languages (Ts'ixa, Gǀui, and Eastern 
ǁAni); in other languages (e.g. Nata Shua and Khwe) the association of the à series with 
subjects strengthened rather than weakened, and this series was ultimately used for NPs 
in both subject and object roles. 

Elsewhere, i.e. with NPs that were not marked by PGN markers, the (ʔ)à remained a 
free word. In most of the Kalahari Khoe languages this free morpheme became exclusively 
but optionally associated with non-PGN-marked objects, and is not employed on 
subjects at all. The next development that occurred in the Kalahari Khoe lineage was 
the extension of the marker (ʔ)à from non-PGN-marked objects to PGN-marked objects, 
presumably via reinforcement. In Khwe, where (ʔ)à does occasionally mark subjects, it 
is restricted to non-PGN-marked ones; no extension to PGN-marked subjects occurred. 

The above scenario focuses on the formal aspects of the grammaticalisation of the accusative 
marker (-)(ʔ)à. The schematic representations provided in Figures 2 and 3 show 
the main concomitant developments in usage across the two major lineages. The arrows 
show diachronic developments. However, not all of these changes can be located in a sin- 
gle chronological sequence with respect to one another, and hence two parallel pathways 
are indicated for each representation. It should be noted that not all of the diachronic 
developments occurred in all languages. 

The initial stages of Figures 2 and 3 are the same: (ʔ)à is a free copula in presentational 
clauses that occurred after the NP presented to the addressee's attention, as illustrated 
in the Khwe examples (7a) and (7b) above. In these environments that free copula served 
an indexing function in the Peircean sense - cf. footnote 7 on the term copula. It is not 
unreasonable to presume that the same marker could also be used in ordinary verbal 
clauses to draw attention to an NP, indexing its presence and drawing the addressee's 
attention to it. Khwe example (31) illustrates this usage in one modern language. There is 
no reason to suppose either of these uses predates the other. In other words, the proposal 
is that (ʔ)à began as an indexical word that served to draw attention to a referent entity, 
regardless of whether it occurred in a dedicated presentational clause or in an ordinary 
verbal clause. 


(31) Khwe (West Kalahari Khoe; Kilian-Hatz 2008: 220)
     ndée!  tómtom-xó     à    ka     tí   à    tóm-a-te?
     mum    swallow-NMLZ  COP  there  1SG  ACC  swallow-ATV-PRS

     ‘Mum, there is a swallowing thing [i.e. a python] that (wants to) swallow me!’

[Figure 2: flowchart; box labels in reading order]
- free presentational copula; also used to draw attention to NPs in verbal clauses
- strong association with definite Os; weaker association with indefinite Os and other roles
- reanalysed as suffix to PGN markers; elsewhere remained free
- reinterpretation as optional O marker; occasional S marker
- PGN markers become obligatory in NPs
- obligatorification of suffix in certain circumstances, usually on O NPs and marked S NPs

Figure 2: Diachronic developments of (-)(?)à in the Khoekhoe lineage


My proposed initial stage is somewhat reminiscent of the initial stage suggested in
Kilian-Hatz (2013: 376-378) and König (2008: 276-277). In all of the scenarios (-)(?)à began
as a type of copula, though in Kilian-Hatz's and König's accounts it was not specifically a
presentative one. It was, however, restricted to verbless clauses with indefinite subjects.
This copula then became a focus marker that introduced new information, in particular
indefinite subjects and indefinite objects. By contrast, I take the presentative use (an
attentional resource that permits the speaker to direct attention to something so that it
comes to occupy the centre of the joint attentional frame; Tomasello 2003) across both verbal
and verbless clause types to be the original source stage for the diachronic changes; how
that relates temporally to the use of (?)à as an attributive or identifying copula is
not clear to me, and is irrelevant to my scenarios for the development of the accusative
marker.


¹⁷ Elsewhere, the same example is given a different free translation, “Mom, there is a swallowing thing here;
it swallows me!” (Kilian-Hatz 2008: 250). Given the discussion on the previous page (Kilian-Hatz 2008: 249),
this is inappropriate, and the monoclausal free translation given in (31) is preferable.

[Figure 3: flowchart; box labels in reading order]
- free presentational copula; also used to draw attention to NPs in verbal clauses
- strong association with definite Os; weaker association with indefinite Os and other roles
- reinterpretation as optional O marker; also optional S marker in some languages
- free morpheme optional on indefinite O NPs; precluded on definite O and S NPs
- free morpheme extended to PGN-marked O NPs
- free morpheme increased in frequency on PGN-marked O NPs until (almost) obligatory
- free morpheme reanalysed as accusative marker; remained optional in many circumstances
- reanalysed as suffix to PGN markers; elsewhere remained free
- obligatorification of bound O marker; obligatorification of bound S marker (some languages)
- suffix fused with PGN markers, forming separate series
- PGN series associated with Os; also with Ss in some languages

Figure 3: Diachronic developments of (-)(?)à in Kalahari Khoe lineages

A crucial feature of Figures 2 and 3 is that the indexing function of (-)(?)à was not
restricted to NPs in any particular grammatical relation. Nonetheless, from early on, at
least from the second stage, it was strongly associated statistically with certain types of
object, specifically definite objects (not indefinite ones, as per Kilian-Hatz 2013:
376-378), less strongly with indefinite objects and intransitive subjects, and perhaps even
less with transitive subjects and locatives. What motivated these initial associations?

In essence, my answer is that NPs were indexed and drawn to the addressee's attention 
when they were unexpected for some reason. One circumstance in which unexpected- 
ness emerges is when the NP in the grammatical role does not fit with the prototype for 
the role. For the two core grammatical relations in transitive clauses the prototypes may 
be assumed to be something like the following: 


• Transitive subjects (Agents) are prototypically given (presuming my reformulation
  of DuBois's (1987) given A constraint, McGregor 1998), animate, and definite;


• Objects (Undergoers) are prototypically new, inanimate, and indefinite (e.g.
  Comrie 1979).


For intransitive subjects I presume no corresponding prototypical features: NPs in 
this role are not strongly associated with any particular givenness, animacy, or definite- 
ness values. Thus different intransitive clause types are associated with different norms 
on these dimensions. This means that for intransitive subjects unexpectedness must be 
based on considerations other than not matching a prototype. For NPs in this role either 
only the local discourse consideration that it is informationally new or indefinite is rel- 
evant to the evaluation as unexpected, or (if given and/or definite) the unexpectedness 
relates to the identity of the filler of the role — some other entity being expected in the 
role. 

What may have happened in the early stages of the scenarios in Figures 2 and 3 is
that object NPs were marked as unexpected primarily when definite, that is, when they failed
to match this component of the role prototype. Ultimately, all or the majority of definite
objects came to be marked by (?)à.¹⁸ If, in these early stages, PGN markers were markers
of definite, i.e. identifiable, NPs (as in some modern Kalahari Khoe languages, e.g. Khwe,
Kilian-Hatz 2008: 43; Ts'ixa, Fehn 2014: 63, 74), the strong affinity of (?)à with PGN
markers can be accounted for. For these NPs marking by (?)à became obligatory or almost
obligatory, and this fed into the development of the marker into a suffix, and ultimately
to the loss of its separate status as a morpheme and its incorporation into the PGN forms of
one of the series in Kalahari Khoe languages. This series is the one that is in all languages
associated with NPs in object roles. In Khoekhoe something different happened: the PGN
markers generalised to all NPs regardless of definiteness, and the -à suffix went with them
on all object NPs by extension. As already remarked, in Nama-Damara it seems that -à is
obligatory on objects; it may be optional in !Ora, but no information is available on the


¹⁸ I presume that this was a gradual process, beginning with only some definite objects being marked, and
that the frequency of marking increased over time. However, a rapid, virtually instantaneous event cannot
be ruled out. Both are consistent with the proposed scenarios.
conditions of its use and non-use, and without relevant data it is pointless to speculate
on its development.

The situation for indefinite NPs was initially quite different, and remained different
in Kalahari Khoe languages where PGN markers did not generalise to all NPs. Indefinite
NPs satisfy the relevant component of the prototype, and their unexpectedness
(and thus marking by the optional (?)à presentative index) could only be based on
local considerations relating to the discourse context. These local considerations concern
information status (e.g. whether new or contrastive) on the one hand and the extent to
which the NP satisfies the patientivity profile prototypically associated with the role
(whether it is more patientive than normal) on the other. The result was that in Kalahari Khoe
languages (?)à continued to be a free word with non-PGN-marked NPs, where it
remained optional on object NPs. Reflections of these considerations remain in the modern
languages, where, as seen in §3, the motivations for use or non-use of the accusative
marker differ across the languages: information status is a relevant variable in all
languages; patientivity profile is documented as a consideration only for Shua.

The overall preference of (-)(?)à for objects was further skewed by its infrequent
occurrence on subject NPs. In Khoekhoe, marking of subject NPs became restricted to certain
marked syntactic environments, such as on “deposed” subjects. As a result, -à was
ultimately interpreted as an oblique suffix (as per the analysis of Haacke 2013b: 341). In
Kalahari Khoe languages the -à series of PGN markers became the one that was
consistently associated with objects; in addition, in some languages it was associated with
subjects, presumably through extension from the occasional uses of the marker on
PGN-marked subjects. When the free (?)à was extended to definite NPs it was only to those
in object roles. In those languages like Khwe where the free (?)à also occurred on
indefinite subjects (almost invariably intransitive), there was no corresponding extension to
definite subjects. Thus the occurrence of (?)à on subjects in Khwe is a relic of the
original indexical-presentative function; it is not a later extension of the marker to subjects.
In East Kalahari Khoe languages this use either never arose or completely disappeared.
The strength of the association with objects resulted in reinterpretation of (?)à as an
accusative marker in Kalahari Khoe languages.

It is important to observe that it was definite NPs that overwhelmingly tended to be
marked by (-)(?)à; indeed, in many Kalahari Khoe languages, they are etymologically
double marked. Two staged sequences of grammaticalisation of (-)(?)à were involved with
definite NPs, one resulting in the fusion of à with PGN markers, the other involving the
expansion in usage of the free reflex of (-)(?)à (indicated by the greyed boxes in Figure 3).
Both were motivated by the fact that definite NPs failed to match the prototype for
objects. At some point in the first sequence only direct object NPs that were indefinite (i.e.
non-PGN-marked) were marked by the free (?)à. This situation is highly marked and
unusual in that more prototypical objects are morphologically more marked than less
prototypical ones. The extension of the marker to definite direct object NPs may have
been driven by this disparity. It is likely that at the beginning of the second sequence,
as of the first, the marker was a presentative index, and that it was only subsequently
reanalysed as an accusative marker. Once marking was established as the norm for definite
NPs, the presentative meaning would no longer be associated with (?)à. The loss of its
presentative meaning may have been what ultimately led to the reanalysis of (?)à as an
accusative marker on definite NPs: without its presentative meaning and with the definite
meaning being marked by the presence of the PGN marker, the only meaning available in this
circumstance for the morpheme (?)à was accusative. Subsequently this reanalysis extended
to indefinite NPs as well, where the free (?)à was reanalysed as an optional accusative
marker, as shown in the final stage of Figure 3.

The reinterpretation of (?)à as an accusative marker was concomitant with the loss of
its inherent presentative value. What happened at this stage was that what was strongly
associated with (?)à came to be interpreted as its coded meaning; correspondingly, the
coded presentative sense was lost. Simultaneously with this, meanings became associated
with the use and/or non-use of the accusative marker, which was optional at least
on indefinite NPs. As per my theory of optional grammatical marking (McGregor 2013),
the meanings that could be associated with usage and/or non-usage of the marker were
restricted to different values of the features [prominent] and [backgrounded] and their
combinations. This process of reinterpretation involved no significant meaning change,
as (?)à-marking of object NPs already presented the NP to the addressee's attention. In
short, the process involved at this point is a well-known one in grammaticalisation:
the replacement of coded meaning with habitually associated meaning. In the
present case both presentative and case meanings remained, albeit in somewhat
modified forms. The meanings also changed their loci of expression: presentative meaning,
in the revised form [+prominent], became associated with usage of the morpheme,
while the case meaning habitually associated with the morpheme took the place of its
coded meaning.

An important difference between the two sequences involved in Figure 3 is that in the
first one marking by (?)à was strongly associated with definite NPs from the beginning,
whereas in the second sequence the marking was initially most strongly associated with
indefinite NPs. The former situation is contrary to the scenario of Kilian-Hatz (2013:
376). It might seem that Kilian-Hatz's initial stage is more in keeping with the use of
a presentative marker, which presumably generally serves to introduce new items into
the discourse. Two observations attest to the plausibility of my interpretation. First, I
would agree that introduced items typically present new information, information that
is not retrievable from the previous discourse. However, in that it indexes the entity, the
marker presents the item as identifiable by the addressee, namely as the target of the index.
The situation in the second stage is as assumed by Kilian-Hatz (2013). But its foundation
is quite different from that assumed in Kilian-Hatz (2013: 376). It is not due to an
association of a focus marker with indefinite NPs, but is rather a consequence of reanalysis
of the PGN morphology that was associated with definite NPs. Second, the presentative
marker was used to draw attention to something, to single it out as noteworthy, and not
necessarily to introduce it. In general one can expect that a speaker will draw attention
to something when there is something unusual or unpredictable about it. This may be
that it is assumed to be unknown to the hearer, and needs to be presented to them; but
there are other reasons that concern not the identity of the thing, but e.g. whether it
matches the prototype for the grammatical role. In other words, what the addressee's
attention is drawn to need not necessarily be the presence of the entity in the context.

In Kalahari Khoe languages the strengths of the associations of (?)à with object NPs
vary across the languages and within them according to the circumstances, as seen in §3.
In Shua the strongest associations are with NPs very high on the animacy hierarchy,
including pronouns and personal names, where (?)à is (almost) obligatory. A similar
thing happened in Khwe, although somewhat unexpectedly (?)à is obligatory on
personal names but not on pronouns. In Ts'ixa definiteness and word order seem to have
been the major factors (see Table 7 above). In Kalahari Khoe languages obligatorification
of (?)à on object NPs remained local and restricted, unlike the situation for the à
series of PGN markers (obligatorily chosen if a PGN was used on the object NP) and -à
in Nama-Damara. Elsewhere (?)à remained optional. Nonetheless, there were evidently
statistical differences in the frequency of usage of (?)à depending on these factors, and
corresponding differences in motivations for presence vs. absence of (?)à.

Let us see how the modern situations might have arisen historically. Here I outline the
three relevant scenarios, linking them to actual situations in the modern languages. I do
not attempt to account for the situations in the modern languages, which is impossible given
the present state of knowledge.

In contexts in which the marker was infrequently used on object NPs, use of (?)à simply
took the value of the morpheme as an attention-director, while no meaning was associated
with its absence, the normal and unmarked condition. The expression-locus of the
presentative meaning shifted from the morpheme itself to its use. This is the situation
for inanimate and lower order animate object NPs in Shua.

Where neither use nor non-use of (?)à was strongly dominant, the same shift in the
expression-locus of meaning could have occurred, and use could still be associated with 
prominence as an attention director (although this may have been a somewhat reduced 
type of prominence vis-à-vis the initial state where the marker was rarely used). At the 
same time, as non-use of the marker became a less frequent choice, and this choice be- 
came more restricted, non-use could have begun to acquire a meaning. When use and 
non-use had become roughly equal in frequency the contrast between them was liable 
to be reinterpreted as an equipollent one, in which neither is marked with respect to 
the other. In this circumstance, rather than carrying a complementary meaning to use, 
non-use conveyed a qualitatively different meaning. According to my theory of option- 
ality, there are restrictions on what this new meaning can be: it must be [backgrounded] 
(McGregor 2013). Thus one arrives at the situation represented in the final column of 
Table 5, which may be the situation for PGN-marked NPs in Shua (see Table 6). 

Where the frequency of use of (?)à was or became high, as on non-prototypical object
NPs such as pronouns in Shua and PGN-marked definite NPs in Khwe, the original
attention-directing value of the marker would be completely lost with the high degree 


¹⁹ At this stage then for NPs of the specified type the speaker is forced to choose between foregrounding
and backgrounding the object NP. There is no option of conveying a neutral meaning or a (strongly) focal 
meaning. If such meanings are desired, then other means of expression might be chosen by the speaker, e.g. 
expression by a pronominal rather than a lexical NP or use of another focal strategy such as word order 
or intonation. 
of usage. Assuming the increase in frequency of use was a gradual process, the meaning
associated with non-use as per the previous paragraph would have been retained. In
such circumstances only non-use of the marker would be meaningful, as in the case of
personal name objects in Shua. The association of meaning with non-use of (?)à is not,
however, dependent on a gradual increase in the use of the marker. The same process as
invoked in the previous paragraph could account for a meaningful non-use even if this
arose virtually instantaneously.

To wind up this discussion, it is worth drawing a brief comparison with the
grammaticalisation of optional ergative case-markers in Australian languages; this lends some
credibility to the proposed grammaticalisation scenario for Khoe languages. A number
of Australian languages exhibit, as suggested by e.g. McGregor (2010; 2013; 2017), the
association of a type of focal marker with transitive subject NPs, where the focal marker
was originally an indexical element. This is a plausible source for the optional ergative
marker in some Australian languages. The critical grammaticalisation processes here
are essentially the same as those involved in the development of the optional accusative in
the Khoe family: highlight and draw attention to the unexpected and/or non-prototypical. The
differences concern on the one hand which of the two roles of transitive clauses was
selected for this special attention, and on the other the nature of the erstwhile indexical
element: a presentative copula in Khoe languages, often a determiner or pronominal
element in Australian languages. Furthermore, it is noteworthy that in both Australia and
southern Africa evidence of the earlier attention-directing meaning remains in some
languages in the otherwise inexplicable occasional use of the marker on subjects of
intransitive clauses.


5 Conclusions 


I have suggested that, despite the cautions rightly voiced by Fehn (2014: 319-320),
it is possible to propose a viable scenario for the emergence and development of the
marker (-)(?)à as an accusative marker in Khoe languages. This scenario is preferable
to the proposals of Kilian-Hatz (2008: 55, 2013: 376-378). It postulates an initial state in
which (?)à was a presentative copula, and traces its development into the final vowel of
a set of PGN markers that are consistently associated with NPs in the object role and an
optional accusative marker in most Kalahari Khoe languages, and into an oblique suffix
in Khoekhoe.

I have discussed the ranges of uses of (-)(?)à across the Khoe family in as much detail
as possible given present knowledge and limitations of space, in the belief that, in
circumstances such as those of the Khoe languages, where time depth is seriously
lacking, a motivated diachronic scenario requires a broad spectrum of synchronic
variation. I have also as far as possible attempted to motivate stages and developments
among them through reference to other documented processes of grammaticalisation,
in the present instance primarily the development of optional ergative case marking (e.g.
McGregor 2010; 2013; 2017).
Much more work needs to be done aside from the above-mentioned need for careful
synchronic investigations of the motivations of optional accusative marking in Khoe
languages. First, I have ignored the dimension of word order, which is likely to also be a
significant factor in the grammaticalisation of (-)(?)à. This awaits more detailed
investigations of word order in most modern Khoe languages. Second, my scenario focuses on
grammaticalisations of (-)(?)à to a marker of direct object NPs. I have not included in
the diachronic story its role as a marker of indirect objects. Contrary to the assertions
of Kilian-Hatz (2013: 373), (?)à behaves very differently on indirect objects than on
direct objects, and it is not obvious how the account of the grammaticalisation of (?)à
as an accusative marker on direct objects should be extended to account for its use on
indirect objects. Nor have I addressed the development of the genitive, attributive and
identifying copula, and other functions of (-)(?)à found in Khoe languages.

Abbreviations

ABL    ablative                       IRR    irrealis
ACC    accusative                     J      juncture morpheme
ADV    adverbial                      LOC    locative
AG     agentive nominalisation        M      masculine
ALL    allative                       NEG    negative marker
APPL   applicative                    NMLZ   nominaliser
ATV    non-past active                NP     noun phrase
C      common gender                  NPST   near past tense
CAUS   causative                      O      direct object
COMP   complementiser                 OBL    oblique
CONJ   conjunction                    PASS   passive
COP    copula                         PER    perlative
DECL   declarative particle           PGN    person-gender-number (marker)
DEM    demonstrative                  PL     plural
DOM    differential object marking    POSS   possessive
DU     dual                           POT    potential
F      feminine                       PROG   progressive
FOC    focus                          PRS    present
GEN    genitive                       PRV    privative
GN     geographical name              PST    past
HAB    habitual                       REFL   reflexive
IDTF   identified                     RPST   remote past tense
IPFV   imperfective                   S      subject
IND    indicative                     SDPST  same day past
INS    instrumental                   SG     singular
IO     indirect object

The rise of differential object marking in
Hindi and related languages


Annie Montaut 


INALCO, Paris (SEDYL UMR 8002, CNRS/INALCO/IRD), Labex 083 (Empirical
Foundations of Linguistics)


Differential object marking (DOM), which involves a contrast between zero marking and
accusative marking by means of an originally dative postposition, appeared in Indo-Aryan
languages only a few centuries ago, as opposed to Dravidian languages, which have had it
from the earliest attested stage (1st century) and have a specific accusative marker. Hindi,
like other Indo-Aryan languages, uses the dative postposition to mark this specific accusative,
a postposition which appeared at around the same period for marking experiencers. It is now
required with human objects with very few exceptions, and optional with inanimate objects
even when definite and individuated. But the historical evolution of the marking shows that
the prevalence of animacy over definiteness is quite recent. The paper is an attempt to find
explanations for this evolution, which only partly corresponds to the scenario put forward
by Aissen (2003), according to which the obligatoriness of marking develops by extension
from an initial kernel of marked objects. The paper will first analyze the properties and
range of DOM in Modern Standard Hindi (semantic, discourse related, particularly topic
related, and syntactic ones; §2 and §3), a fairly well explored topic. I will then inquire into
the historical emergence of DOM (§4), and its presence in non-standard varieties or “dialects”
(§5), both topics far less studied. Finally, the paper will suggest some hypotheses on the
emergence and grammaticalization of the marked accusative in Hindi and related dialects (§6).


1 Introduction 


Differential object marking (DOM), which involves a contrast between zero marking and 
accusative marking by means of an originally dative postposition, is a relatively new 
phenomenon in Indo-Aryan languages (Masica 1982) as is the rise of dative experiencer 
subjects, both expressed with the dative marker. This contrasts with Dravidian languages 
where DOM is attested since the earliest texts, with a specific accusative marker. It is 
obligatory in Hindi only with human individuated objects, and optional with inanimate 
objects even when individuated. However, an inquiry into the historical evolution of the
marking shows that the supposed prevalence of animacy over definiteness is quite recent.


Annie Montaut. The rise of differential object marking in Hindi and related
languages. In Ilja A. Serzant & Alena Witzlack-Makarevich (eds.), Diachrony
of differential argument marking, 253-282. Berlin: Language Science Press.




The aim of this paper is to attempt to find explanations for this evolution, which only
partly corresponds to the scenario put forward by Aissen (2003), according to which
the obligatoriness of marking develops by extension from an initial kernel of marked
objects. The paper will first analyze the properties and range of DOM in Modern Standard
Hindi (semantic, discourse related, particularly topic related, and syntactic ones; §2 and
§3), before looking at the historical emergence of DOM (§4) and its presence in
non-standard varieties or “dialects” (§5), and suggesting some hypotheses on its emergence
and grammaticalization (§6).


2 Basic facts in modern Hindi DOM 


DOM is largely grammaticalized in Modern Standard Hindi, where identified objects are
case marked and can also trigger a change in verb agreement: in ergative constructions,¹
as well as in passive constructions, the verb agrees with an unmarked patient, but not
with a marked patient. DOM is constrained first by the semantic or inherent properties
of the argument (obligatory overt marking), and secondarily by discourse related
properties (optional marking). DOM occurs only with formally transitive verbs, and formal
transitivity is found only with verbs high on the transitivity hierarchy (Hopper &
Thompson 1980; Tsunoda 1985), involving a binary relation between real agent and real
patient. It follows that DOM occurs only with typical agents. In turn, marked objects are
more sensitive to topicality (Dalrymple & Nikolaeva 2011) than to affectedness, the factor
suggested by Næss (2004). As for what is often analyzed as syntactic properties of marked
objects, they ultimately can also be accounted for in terms of discourse related properties,
such as topicality or saliency.


2.1 Morphological properties: flagging and indexation


The case marker is the postposition ko (suffixed to pronouns), the same one that is also
used for beneficiaries and experiencers, i.e. a syncretic dative/accusative case. Example
(1a) illustrates the obligatory marking of human objects (particularly proper nouns and
personal pronouns) with no effect on agreement in the present, whereas (1b) illustrates
the same marking with a verb showing default agreement (masculine singular) in ergative
constructions (past transitive clauses), and (1c) in the non-promotional passive. The
contrast between agreement with unmarked objects (2b) and default agreement (2a) is
found with inanimate objects:


¹ Hindi is a language with (aspectually) split ergativity: larke ne film dekhi [boy.M.SG.OBL ERG film.F.SG
see.F.SG] ‘The boy saw the film’ vs. larka aya [boy.M.SG come.3SG] ‘The boy came’, larka aega [boy.M.SG
come.FUT.3M.SG] ‘The boy will come’. Examples are from everyday exchanges or my own when not
otherwise indicated.

(1) Modern Standard Hindi (own data)

    a. mai  tumko / Ram ko  / apni beti     ko  dekh raha      hũ
       1SG  2.ACC   Ram ACC   REFL daughter ACC see  PROG.M.SG PRS.1SG
       ‘I am looking at you / Ram / my daughter.’

    b. maine   tumko / Ram ko  / apni beti     ko  kal       nahi dekha
       1SG.ERG 2.ACC   Ram ACC   REFL daughter ACC yesterday NEG  see.PFV.3M.SG
       ‘I did not see you / Ram / my daughter yesterday.’

    c. donõ    admiyõ   ko  dekha gaya
       the.two man.M.PL ACC see   PASS.PST.M.SG
       ‘Both men were seen.’


(2) Modern Standard Hindi (own data)

    a. maine   is  film       ko  dekha
       1SG.ERG DEM movie.F.SG ACC see.PFV.3M.SG

    b. maine   yah film       dekhi
       1SG.ERG DEM movie.F.SG see.PFV.3F.SG

       ‘I have seen this film.’


2.2 Type of arguments: Animacy, definiteness, specificity 


Since the role played by the semantics of the verb as suggested in Mohanan (1994: 81) 
can be seriously questioned (cf., inter alia, Self 2012 for an overview), and given the 
limitations of this study, it will not be treated here. 

As in many languages, the animacy scale (human > animate > inanimate) and the definiteness
scale, into which specificity can be integrated (Croft 2003: 132) (Personal pronoun /
Proper name > Definite NP > Indefinite specific NP > Non-specific NP), overlap, with
an apparent prevalence of animacy: the indefinite human object in (3a) is obligatorily
marked, and so are proper nouns referring to human objects, in contrast with those
referring to inanimate objects (3b). Pronominalized inanimate objects are more often marked
than the corresponding nouns (3c). Example (3d) shows that the pronominalization of
‘the note’ does trigger the accusative marking, whereas the same noun (‘the note’) occurs
thereafter in the unmarked form:


(3) Modern Standard Hindi (own data)

    a. kisi  ko  bulao!
       INDEF ACC call.IMP
       ‘Call somebody!’

    b. maine   Dilip ko  (*Dilip) dekha   / maine   Kalkatta dekha
       1SG.ERG Dilip ACC (Dilip)  see.PFV   1SG.ERG Calcutta see.PFV
       ‘I saw Dilip / I saw Calcutta.’

    c. koi   pita   bhi  ise     (is   bat   ko  / yah  bat)   bardast  nahi kar sakta
       INDEF father even 3SG.ACC  this thing ACC   this thing  tolerate NEG  do  can.M.SG
       ‘No father at all could tolerate this (this thing).’

    d. jeb    se   do  rupae  ka not  nikala   “Jivrakhan, ise     rakh-lo”
       pocket from two rupees of note took.out  Jivrakhan  3SG.ACC put-take.IMP
       Jivrakhan ne  not  mani  beg mẽ rakh-liya
       Jivrakhan ERG note money bag in place-took
       ‘He took a two rupee note out of his pocket, “Jivrakhan, take it”. Jivrakhan
       put the note into his purse.’


Animacy seems at first sight to be the prevalent trigger for accusative marking, while
definiteness and specificity seem to act as an optional trigger only, as summarized in
Aissen (2003: 469) on the basis of the dominant view in Hindi linguistics. However, the
deviant cases can be better explained in terms of specificity or saliency, as will be argued
below.


2.2.1 Deranking 


Human animates can, exceptionally, remain unmarked, a case of “deranking” in Aissen’s
(2003) terms. For example, variation is found with NPs that refer to the function their
referents are associated with rather than to the respective individuals (4a)-(4b),
with NPs with collective reference (5a)-(5b), and with NPs used in comparisons that
decrease the referentiality of the NP (6a):

(4) Modern Standard Hindi (own data)


a. meri saheli    ne  naya naukar       rakha
   my   friend.SG ERG new  servant.M.SG place.PFV.M.SG
   ‘My friend took a new servant.’

b. ve  larkà    dekh rahe hai
   3PL boy.M.SG look PROG PRS.3M.PL
   ‘They are visiting a boy (a suitable groom).’


(5) Modern Standard Hindi (own data) 

a. maine   bahut log         dekhe,   bahut [...]    dekhi,   bahut gandagi   dekhi
   1SG.ERG many  people.M.PL see.M.PL many  car.F.PL see.F.PL much  dirt.F.SG see.F.SG
   ‘I saw a lot of people, a lot of cars, much dirt’

b. maine   bahut logo        ko  dekha
   1SG.ERG many  people.M.PL ACC see.PFV.3M.SG
   ‘I saw many people’


(6) Modern Standard Hindi (own data) 


a. tum jaisá     koi   nahi dekha
   2   like.M.SG INDEF NEG  saw.PFV.M.SG
   ‘I didn't see anybody like you’ (movie title)

b. maine   kisi  ko  nahi dekha
   1SG.ERG INDEF ACC NEG  saw
   ‘I didn't see anybody.’


Examples such as (4) have been well discussed in the literature (Mohanan 1994; Dayal
2011) and are analyzed in Self (2012) as an illustration of what he calls the specificity
requirement, which, according to him, may be the main and only constraint. This
constraint requires the object NP to be specific in order for it to be marked. Examples
such as (5) and (6) are less frequently discussed, but also show that human non-specific
objects can be unmarked when they involve a collectivity considered as an indivisible
whole (5a) rather than a set of individuals (5b), or when their referentiality is
decreased by a comparison in a negative context (6a).


2.2.2 Upranking 


Certain inanimates and abstract nouns in the object position are very frequently marked.
This type of upranked object has been noted for nouns with unique referents such as
‘moon’, ‘sun’, ‘earth’ or ‘ocean’, whose reference can be identified on the basis of shared
knowledge (7). Abstract nouns such as ‘death’ or ‘time’, which belong to a different class
and are not referential, are in fact also quite frequently marked (8):


(7) Modern Standard Hindi (own data) 
cad  ko  dekho!
moon ACC look.IMP
‘Look at the moon!’


(8) Modern Standard Hindi (Agyeya, 1951, Nadi ke dvip) 
ham kya samay ko  / mrityu ko  rok  sakte hai?
1PL INT time  ACC   death  ACC stop can   PRS.1M.PL
‘Can we stop time, death?’


In Spanish, abstract nouns are far more often marked than concrete inanimates: 79% of
abstract nouns occur with the preposition a, as against only 21% of concrete inanimates
(Company Company 2002: 209). In Hindi, non-referential abstract nouns such as
‘glass’, ‘darkness’ or ‘outside’ can likewise be marked:




(9) Modern Standard Hindi (Self 2012, from Burton Page 1957) 
loha sise  ko  katta hai
iron glass ACC cut   PRS.3M.SG
‘Iron cuts glass.’


(10) Modern Standard Hindi (Vinod Kumar Shukla, 1996, Khilega to dekhenge) 


a. ham andhere  ko  rok  dete
   1PL darkness ACC stop give.COND.1M.PL
   ‘We would stop the darkness.’

b. hamne   püre  bahar   ko  band   kar  diya hai
   1PL.ERG whole outside ACC closed make give PRF.3M.SG
   ‘We have locked up all the outside’


One might think that the whole series consists of nouns that behave like mass nouns
such as ‘glass’ in (9), which according to Self (2012) are similar to natural kind terms,
and natural kind terms may have the properties of definite NPs (Gross 2009). However,
the fact that they are more often marked than other inanimates (as in Spanish), which
are marked only when specific, both in Hindi and Spanish, requires a different
explanation. The reason, not explored to my knowledge, may be that such abstract
nouns, being semantically rigid, are not liable to variations of definiteness/specificity:
except when they change status and become discrete (‘a specific blue’, ‘the very same
sadness’), they tend to be marked for their semantic rigidity. Hypotheses along these
lines should be checked in a distinct study.


2.3 Syntactic properties of the object with attribute 


It has been argued that marked objects have differential control properties: no unmarked 
object can control a non-finite adjunct (Bhatt 2007), whereas propositional adjuncts are 
commonly controlled by marked objects, particularly after main verbs of perception. 
Bhatt’s (2007) examples are the following:


(11) Modern Standard Hindi (Bhatt 2007: 17) 


a. Mina_i ne  bazar  mē ek sailani_j ko  nacte   hue_j dekha.
   Mina   ERG market in a  tourist   ACC dancing being see.PFV
   ‘Mina_i saw a tourist_j dancing_j in the bazar.’

b. Mina_i ne  bazar  mē ek sailani_j nácte   hue   dekha.
   Mina   ERG market in a  tourist   dancing being see.PFV
   ‘In the market Mina_i saw a tourist_j when he_i was dancing’ (*‘a tourist
   dancing’)


According to Bhatt (2007), the non-finite adjunct ‘dancing’ in (11b) can only be
controlled by the subject of the matrix clause, Mina, not by the unmarked object, whereas
the same object, when marked, controls the adjunct (11a). However, unmarked objects are
commonly used with an adjunct that they control, although they are in this case typically
inanimate. In (12a), the implicit subject of the participial clause ‘having come back / be
back’ (state, past) is controlled by the unmarked object gari ‘car’, and in (12b), the
participial clause (dynamic event, present) is controlled by the unmarked object jamun
‘Java plums’. Both sentences involve a coverb, whose subject is controlled by the main
verb's subject, and the same control rule applies within the participial clause as in (11a):


(12) Modern Standard Hindi (Krishna Baldev Vaid, Dusra na koi) 


a. gari     vapas ài   hui          dekhkar maine   soca...
   car.F.SG back  come be.PTCP.F.SG see.CV  1SG.ERG think.PFV.M.SG
   ‘I saw the car having come back and thought...’ (not ‘Having come back/I
   came back and I saw the car.’)

b. kale-kale   jamun gayab    hote  dekhkar uske muh   se
   black-black jamun vanished being see.CV  his  mouth from
   lār         tapakne lagi
   saliva.F.SG drip    start.PFV.F.SG
   ‘Seeing the black Java plums disappearing his mouth started watering.’ / ‘He
   saw the black Java plums disappearing and he started salivating.’


Both participles, ài hui, the past participle of the verb ána ‘to come’ in (12a), and gayab
hote, the present participle of the verb gayab hona ‘to disappear’ in (12b), are clearly
controlled by the object of the coverb. In other words, a small clause complement of a
matrix verb may license an unmarked noun only if it is inanimate and accompanied by an
attributive participle, as in (12), not when the noun is animate.

The differential behavior of (11b) and (12), both with unmarked objects, can be explained
by the fact that in (11b) the unmarked object is a human being in the singular, which
makes its unmarkedness highly atypical: a tourist in the market, as an unmarked human
patient, must be totally devoid of individuation (like ‘people’ in example (5a)), treated as
a mere element of the bazaar. Therefore, its individuation by means of a striking event
(dancing in the bazaar) contradicts its implicit characterization as non-salient. The ‘car’
or the ‘Java plums’ in (12), in contrast, are definite inanimates, but their unmarkedness
conforms to the tendency for inanimates to remain unmarked if devoid of discourse
prominence (cf. below). What is centre-staged in (12) is not the entity (‘plum’ or ‘car’)
but the global scenario of the disappearance or re-appearance respectively. The objects
are not described for their own sake, since what prevails for the speaker is the event in
which the object is involved, not the object itself.

The complex predicate gáyab honá ‘to disappear’ is formed with the adjectival unit gáyab and light verb
ho ‘be’, here in the present participle form.

Similar reasons account for the systematic marking of all objects with nominal or
adjectival attributes, whether human or inanimate and non-specific, a fact which remains
unnoticed in the literature on Hindi DOM. The following series (13) involves verbs with
two objects such as ‘judge’ / ‘consider’ / ‘call’ / ‘make’ (X Y), a main object and its
attribute:


(13) Hindi (own data)


a. mai cor   ko  / *Ø cor   kahta hu
   1SG thief ACC      thief say   PRS.1SG
   ‘I call a thief a thief.’

b. mai billi ko  apna dusman / beiman   manta    hu
   1SG cat   ACC REFL enemy    disloyal consider PRS.1SG
   ‘I consider cats as my personal enemies/disloyal.’

c. ve  rasi ko  / *Ø sap   samajhte   hai
   3PL rope ACC      snake understand PRS.M.PL
   ‘They mistake a rope for a snake’ (or ‘ropes for snakes’).

d. ve  punya  ko  / *Ø pap banáte hai
   3PL virtue ACC      sin make   PRS.M.PL
   ‘They transform virtue into sin.’


The marking is obligatory even for non-specific indefinite inanimate objects. Here 
the attributive adjunct, noun or adjective, does not describe an event in which the object 
could in principle be a simple element less salient than the process itself as in (12), where 
the adjunct is a mere qualification. The sentence amounts to attributing a property to the 
noun, and this attribution itself makes the noun centre-staged and not secondary to the 
property or part of it. 


2.4 Information structure 


The above examples (11)-(13) corroborate a major principle of differential object marking
that Dalrymple & Nikolaeva (2011) as well as Iemmolo (2010) have captured in terms of
information structure, through the notion of topicality (Iemmolo 2010) or secondary
topicality (Dalrymple & Nikolaeva 2011). The syntactic properties analyzed in §2.3 are in
conformity with a more general tendency which also holds in the absence of syntactic
constraints. Dalrymple & Nikolaeva (2011) assume that topical objects are marked, while
narrow focused objects, even if definite and specific, are obligatorily unmarked, giving the
following Hindi example:


(14) Hindi (Dalrymple & Nikolaeva 2011: 167) 


a. ham mez   paüchége
   1PL table wipe.FUT.M.PL

b. ham mez   ko  paúchége
   1PL table ACC wipe.FUT.M.PL
   ‘We will wipe the table.’


Wide focused objects are preferably unmarked, narrow focused objects are obligatorily unmarked as
opposed to topicalized objects, which are marked. For a definition of wide vs. narrow focus, see Rebuschi
& Tuller (1999: 215). Wide focus sentences felicitously answer “out of the blue” questions such as “What
happened?”, whereas in narrow focus at least one of the participants is given or known, such as “What did
X do?”, “What did X do with Y?”.

In (14a) “the object is construed as part of the event and is not individuated as a
pragmatically salient element: informationally, it is part of wide focus” (Dalrymple &
Nikolaeva 2011: 167), whereas in (14b) the ‘table’ was already the centre of attention.

However, topics can remain unmarked in Hindi, either by simple fronting (15a) or 
fronting with topic particle (15b), which suggests that topicality, whether secondary or 
primary, is not in itself responsible for the marking of objects. 


(15) Hindi (own data) 


a. yah  film      kisne   dekhi?
   this film.F.SG who.ERG see.PFV.F.SG
   ‘This film, who saw it?’

b. yah  bat   to  ham sab jante haï
   this thing TOP 1PL all know  PRS.PL
   ‘This thing, we all know it.’


Besides, internal objects, which are, by nature, very low in topicality, may be marked 
and statements such as (16) are in no way exceptional: 


(16) Hindi (Vinod Kumar Shukla, 1996, Khilega to dekhenge) 
zindagi ko  jina sthagit   maut  ko  jina hai
life    ACC live postponed death ACC live is
‘To live life is to live a deferred death.’


The reason why some topics remain unmarked whereas some internal objects are 
marked is, again, related to how the speaker wishes to represent the situation involving 
the object: even a topicalized object may be deprived of saliency in comparison with 
the process that it is part of (knowing in (15b)) or with the focus in (15a), and thus can 
remain unmarked, since it is the event or the focus, and not the object, that is discursively 
salient. An initial sentence like (15b) can be followed by a proposition discussing its 
whole content (“but we don’t care”), but not bearing on the topicalized notion (“this thing 
is most important or interesting”). In contrast, internal objects, if emphasized for the
purpose of parallel contrast as is the case in (16), acquire sufficient saliency to be marked: 
this is not really life that we are living, it is rather like living death. Semantically the 
added meaning to ‘life’ is its opposite (‘death’), hence the marking. Without marking, the 
object comes back to its ordinary status as an internal, non-individuated object, which 
is part of a process from which it cannot be dissociated. 

In a discourse with no particular constraints, the same reasons account for the marking
of the vast class of optionally marked inanimate objects. In (17) for instance, the same
object ‘door’ occurs first as marked and then as unmarked, although the first occurrence
refers to an indefinite, and the second has more specifying properties since it does not
refer to just any ‘door’, but to ‘our own’ door.

For instance phir bhi log is saccai se dür bhagte hai ‘however people run away from this truth’ [that jisne is
dharti par janm liya hai use mrityu prapt hogi ‘whoever was born on this earth will die’] (Bollywoodtadka).
A continuation bearing on the topicalized notion requires an initial sentence with a marked object (is bat
ko). One may hypothesize that both sentences in (15) have a focused constituent, which makes topicality
less prominent.


(17) Hindi (Vinod Kumar Shukla, 1996, Khilega to dekhenge)
ek  darvaze ko  band  kar,  hamne püre bahar   ko  band  kar
one door    ACC close do.CV 1PL   all  outside ACC close do
diya           hai. ... apne kamre ka darvaza band   kar   sari
give.PFV.3M.SG PRT      REFL room  of door    closed do.CV all
duniya ko  bahar   band   kar diya.
world  ACC outside closed do  give.PFV.3M.SG
‘By closing a (mere) door, we have locked up the whole outside. (...) By closing
the door of our room, we have locked outside the whole world.’


The door in the first sequence, although appearing as new information and not specific, 
is singled out as responsible for huge consequences, in contrast with its triviality: hence 
the marking. In the second occurrence, this disparity is already given, and it is the event 
as a whole (to lock oneself in one's room) that is emphasized: hence the absence of 
marking. 

In (18), the object is already present in the preceding context, where the village head
asked the master, Guruji, to open a lock on a door. In (18a), the object ‘lock’ is
topicalized by its position and is definite, yet it is not marked: what is emphasized is
the inference that the speaker is able to do the unlocking, since he had locked the door
himself. Besides, the subject is focalized (preverbal position). In contrast, in the very
next sequence, the same lock, again in topic position (18b), is given centre stage because
the protagonist is confronting it for itself (testing its solidity). In segment (18c) as
well, the process singles out the lock (and key) as the centre of everybody's attention,
although it is non-topicalized. When the protagonist goes to open the lock (18d),
everybody's attention shifts from the lock to the process of opening the lock:


(18) Hindi (Vinod Kumar Shukla, 1996, Khilega to dekhenge) 
a. Yah  tala maine   xud  lagàya hai,
   this lock 1SG.ERG REFL put    PRF.3M.SG
   ‘This lock, I put it myself,’

b. tale ko  Gurüj  ne  jhanjhanaya. (...)
   lock ACC Guruji ERG shake.PFV.3M.SG
   ‘Guruji shook the lock noisily’

c. Mai tale ko  khol sakta hu,       cübi mere pas  hai.
   1SG lock ACC open can   PRS.1M.SG key  my   near be.PRS.3SG
   ‘I can open the lock, I have the key with me’.

d. Unhóne    cübi tét  se   nikali.           Ve   tala kholne ja rahe the.
   3.HON.ERG key  belt from take.out.PFV.F.SG 3HON lock open   go PROG PST.3HON
   ‘He took the key from his belt. He was going to open the lock.’

As confirmed by his wife's insistence on the act of opening, totally backgrounding (omitting) the object:
bahar khari uski stri ne kaha ‘mai khol du?’ ‘His wife, who stood outside, said “Shall I open it (myself)?”’


What such examples highlight with marked objects is their saliency (Croft 1991: 155;
Montaut & Haude 2012), a notion I am invoking in the sense of Dalrymple & Nikolaeva
(2011: 14-15, 57) on the role played by a referent in the pragmatic structure of the
proposition, rather than Næss’s (2004) more general interpretation of the term (which
focuses on the question as to which entities are of greater interest for human perception
in general).


3 Particular clause types in Hindi 


3.1 The case of non-promotional passive 


A characteristic of the Hindi passive, apart from the fact that it applies equally to
intransitives, is that it is very frequently non-promotional: it retains the object marker ko
for the noun which is the corresponding object in the equivalent active clause, with the
result of blocking agreement (cf. example (1c) above). The conditions for marking the
ex-object are not the same as those for marking the object in an active sentence, and
an attempt is made below to define them better. Given that the promotional passive
is also frequent, and that marked objects are consequently less frequent in the passive than
in the active, one would expect the obligatorily marked objects of an active sentence,
such as human referential objects, to retain their marking in the passive more readily than
inanimate objects, which are only optionally marked in the active sentence. But this
is not the case. Unmarked human patients of the kind whose marking is absolutely
compulsory in active sentences, such as first person pronouns (19b) or proper nouns (20),
are quite frequent, as are marked inanimates in (21) and (22):


(19) Modern Standard Hindi (own data) 
a. mujhe       aspatal  le   jaya gaya
   1SG.ACC/DAT hospital take go   PASS.PFV.3M.SG
   ‘I was taken to the hospital.’

b. mai aspatal  le   jayi gayi
   1SG hospital take go   PASS.PFV.F.SG
   ‘I was taken to the hospital’ (feminine speaker)


In keeping with Aissen’s (2003: 468) “basic hypothesis: if overt marking is possible with direct objects with
property α, then it is possible with direct objects with property β, where β dominates α”.




(20) Modern Standard Hindi (Times of India, January 2013) 
Sef  Hemant Oberay apne das sahyogiyó   ke sath vaha  bheje gae  the
Chef Hemant Oberoi REFL ten helper.M.PL with    there send  PASS PPRF.M.PL
‘The chef Hemant Oberoi had been sent there with ten of his helpers’


(21) Modern Standard Hindi (Times of India, January 2013) 


mere hazaró   samarthakó     ko  Madurai ikai se   nikal diya gaya hai
my   thousand supporter.M.PL ACC Madurai unit from expel give PASS PRF.M.SG
‘Thousands of my supporters have been ousted from the Madurai unit.’


(22) Modern Standard Hindi 


a. mrtyu ko  / samay ko  rokā nahi ja  sakta
   death ACC   time  ACC stop NEG  PSV can.PRS.3M.SG
   ‘Death / time cannot be stopped.’ (single entities, common knowledge) (own
   data)

b. par bahut dinó tak  sthagit   maut  ko  bhi  nahi jiya ja   saktá
   but many  days till postponed death ACC even NEG  live PASS can.PRS.3M.SG
   ‘But one cannot live even a deferred death for very long’ (Vinod Kumar
   Shukla, 1996, Khilega to dekhenge)

c. unke    vilamban   ko  24 janvari kī subah   us   vaqt kiya gaya      jab...
   3PL.GEN suspension ACC 24 January of morning that time do   PASS.M.SG when
   ‘Their suspension occurred on the morning of January 24 when...’ (Times of
   India 13/1/2015)


The marking of such inanimates, which are essentially compact abstract nouns, is
common to active and passive sentences. The non-marking of human patients, in contrast,
is possible only in passive sentences. The fact that the marking of abstract nouns such as
those in series (22) is maintained irrespective of the construction, whether active or
passive, seems to suggest that this category may be deemed as ranking high in the
hierarchy of markable objects.


3.2 Reduced passive clauses


Passive nominalizations do not confirm this equal frequency of marked humans and
inanimates, since human objects behave quite differently from inanimates in reduced
passive clauses, and there is a triple distinction for inanimates. In Hindi, the nominal or
adverbial reduction of a clause, whether active or passive, requires the genitive marking
of its subject when distinct from the main subject ((23a) and (23b)), with a few exceptions
(23c) corresponding to nouns analyzed as pseudo-incorporated (Dayal 2011) and analyzed
in Montaut (2012) as anti-salient, or as having extremely low individuation.


(23) Modern Standard Hindi (own data) 

a. apka   yahá aná      mujhe   bilkul accha nahi laga
   2H.GEN here come.INF 1SG.DAT really good  NEG  seem.PFV.M.SG
   ‘I did not like at all (the fact) that you came here’

b. ram ke  ate    (ram ate)    hi   sab      gayab       ho gae the
   Ram GEN coming  Ram coming  just all.M.PL disappeared be go  PPRF.M.PL
   ‘Right after Ram came, all had disappeared’

c. andherá  (?ke) hote  hi   sab      gayab       ho gae the
   darkness  GEN  being just all.M.PL disappeared be go  PPRF.M.PL
   ‘Right after the coming of darkness all had disappeared.’


In the nominalized passive clause, the patient is in the subject position and can ei- 
ther be marked by accusative ko, by the genitive or unmarked, depending on the type 
of passive (promotional or not) and on the type of (promoted) object (animate vs. inani- 
mate referent, individuation). While a human patient is obligatorily marked in the active 
and optionally marked in the passive, the nominalized clause echoes both possibilities 
with the optionality of a regular subject marking in the genitive and a retention of the 
accusative marking, but it cannot remain unmarked (24): 


(24) Modern Standard Hindi (Bhatt 2007: 9) 


Rina ka  / ko  / *Ø bazar  mē dekha jana     saram ki bat   hai.
Rina GEN / DAT      market in see   PASS.INF shame of thing is
‘For Rina to be seen in the market is a matter of shame.’


In contrast, inanimate nouns may either be marked as ordinary subjects, retain their 
object marking or have no marking at all like the so-called incorporated objects: 


The way Dayal (2011) and Mohanan (1994) define incorporation excludes the morphophonological features
usually associated with the notion, hence the suggested appellation of “semantic incorporation” (Dayal
2011).

(25) Modern Standard Hindi (Bhatt 2007: 9; author's translation)
Per  ka  / ko  / Ø is   tarah kata jana     saram ki bat   hai
tree GEN / DAT     this way   cut  PASS.INF shame of thing is
‘The fact that the/a tree was cut in this way/this kind of tree cutting is a matter of
shame.’


3.3 The opposite type of noun-verb relation: "Incorporated" objects 


Example (25) shows a distinct meaning of the unmarked noun, devoid of any individuation
to the point of being incorporated. The notion of (semantic) incorporation in Hindi
was elaborated by Dayal (2011) to account for a type of bare nominal with special
behavior, particularly in disallowing pronominal anaphorization. Such objects fail to
control agreement in sentences that ordinarily require object agreement, namely ergative
sentences involving a complement infinitive (26), and abilitative or obligative sentences
with a transitive main verb in the infinitive (27). The standard object-verb agreement
occurs in (26b) and (27b), where the feminine object saikil ‘bike’ controls the agreement
of the matrix verbs ‘do’ and ‘come’ as well as of the infinitive ‘drive’, which in Hindi may
vary in gender. In (26a) and (27a), on the contrary, the infinitive does not vary: it remains
in the masculine form and controls the agreement of the matrix verb, as do intransitive
verbs (26c):


(26) Modern Standard Hindi (own data) 
a. bacce      ne  saikil    calana         Suru      kiya
   child.M.SG ERG bike.F.SG drive.INF.M.SG beginning do.PFV.3M.SG
   ‘The boy started to ride a bicycle’ (has started bicycle riding)

b. bacce      ne  saikil    calàni         Suru      ki
   child.M.SG ERG bike.F.SG drive.INF.F.SG beginning do.PFV.3F.SG
   ‘The boy started to ride a bicycle.’

c. bacco      ne  skul   jana        Suru      kiya
   child.M.PL ERG school go.INF.M.SG beginning do.PFV.3M.SG
   ‘The children started going to school.’


(27) Modern Standard Hindi (own data)
a. mujhe saikil    calana         ata  hai
   1SG   bike.F.SG drive.INF.M.SG come PRS.3M.SG
   ‘I know how to ride a bicycle (how to cycle).’

b. mujhe saikil    calani         ati  hai
   1SG   bike.F.SG drive.INF.F.SG come PRS.3F.SG
   ‘I know how to ride a bicycle.’




In (26a) and (27a) the constituent triggering agreement is the whole infinitival clause,
sometimes considered to be an instance of incorporation of the object into the verb, since
saikil calana ‘bicycle drive’ behaves in this respect like an intransitive verb.

Although both alternating constructions can be used in similar unmarked contexts,
there is a preference for the non-agreeing type with some conventional object-verb
expressions like ‘drink tea’ or ‘buy vegetable’. Here, the infinitive triggers agreement on
the matrix verb:


(28) Modern Standard Hindi (own data) 
mujhe sabzi          kharidna     / ?? kharidni     hai
1SG   vegetable.F.SG buy.INF.M.SG      buy.INF.F.SG be.PRS.3SG
‘I have to buy vegetables’


To summarize, only “incorporated” objects with very low individuation can dispense
with indexation on the verb in the relevant clause types. Marked objects pattern at the
opposite end of the following hierarchy of objects: incorporated (unmarked) > unmarked
(non-incorporated) > marked.

The triggering feature for this triple syntactic differentiation is individuation. It is not,
directly, topicality, nor is it the role played within the focus, although of course the
semantic feature of individuation is also relevant in information structure.


4 The emergence of object marking 


Most scholars do not date the emergence of Modern Standard Hindi before the 19th
century. Prior to this stage, the language, although it is systematically called medieval
or ancient Hindi, is as expected not standardized, and as such it is much closer to some of
the regional varieties today analyzed as independent languages. What is generally called
“Old Hindi” is the so-called sant bhasha, a poetic language forged by the first mystic poets
who expressed their religious opposition to the brahmanic world order by using popular
vernacular speech instead of Sanskrit. This language, which was first used by the
devotional mystic Kabir (14th c.), and later by Mira Bai (16th c.), has been fairly well studied
and shown to display various regional features, taken more from the Eastern languages
in Kabir, and more from the Western varieties in Mira, but fused in what was to become the
literary koine of medieval Northern India. In what follows I will discuss the three main
stages of the DOM evolution in pre-modern “Hindi”.


4.1 First New Indo-Aryan stage: 14th century


During the first stages of Hindi and of other New Indo-Aryan languages (NIA), the
inflectional system of Sanskrit is in the process of being replaced by adpositions (nominal
category) and auxiliaries (verbal category). Yet this process is far from being completed and
the absence of clear relators is a common feature of ordinary discourse, the few oblique
cases maintained in the language being used for various syntactic purposes: the -i
locative for the agent in past transitive processes, and a syncretic oblique -hi (derived from
the fusion of the old dative/instrumental already achieved in Middle Indo-Aryan) for all
kinds of obliques. Most of the time, nouns remain unmarked, the meaning being easily
recoverable from the context and sequence since sentences are usually minimal. This
-hi ending is the most frequent marker of objects in Kabir, whereas the postpositional
marking (ku/kau) is just starting to appear (Strnad 2013: 325), but in both the inflectional
and the adpositional case, the marking is far from systematic.

Similarly, in ergative sentences, like ‘I began/wanted to drink tea’ or ‘I wanted to buy vegetables’ minimal
individuation is required for the object of the complement infinitive to trigger agreement, and agreement
with the object is highly improbable with the bare noun (as opposed to ‘I wanted to buy various vegetable’
or ‘drink this excellent tea’). More examples in Montaut (2012).

Human objects, including proper names, are either marked (29b) or unmarked (29a), 
and sometimes in the same sequence both marked and unmarked proper nouns oc- 
cur (29b): 


(29) a. Hiranakasa     maryau.
        Hiranakashyapu kill.PFV.M.SG
        ‘[He] killed Hiranyakashyapu.’

     b. Ràmahi  janai        janai        Rahimána.
        Ram.ACC know.PRS.3SG know.PRS.3SG Merciful.Ø
        ‘[He] knows Ram, he knows the Merciful’ (Kabir, verse 302)


Even a proper name, if occurring with a predicative adjective, can be unmarked (30a), 
whereas other human referents can be marked (30b): 


(30) a. Rama.Ø kari    sanehi.
        Ram    make.CV dear
        ‘Making Ram your dear.’ (Kabir, verse 381)

     b. apana ádha  aura  ků  kahai       kanáná
        self  blind other ACC say.PRS.3SG one-eyed
        ‘[Being] himself blind, he will call others one-eyed.’ (Kabir, verse 149)


The only category which is systematically marked is the personal pronoun (1st and 2nd
person), and occurrences of the 3rd person are frequently unmarked even when referring
to a human entity.


(31) jaga.Ø mai desu        jaga  na  desi        mohi
     world  1SG see.PRS.1SG world NEG see.PRS.3SG 1SG.ACC
     ‘I see the world, the world does not see me.’ (Kabir, verse 76.3)


Given the fact that humanity, which is today the main (compulsory) trigger for object
marking, does not apply, we would expect inanimate objects to be systematically
unmarked, but this is not the case, and the marking of inanimates seems to be as random as
the marking of human objects. Example (32) for instance displays two parallel clauses
patterning identically, with the same construction, the same semantic class of objects
(the so-called class of single entities), the same relation between predicate and object
and the same ordering of both sequences. Yet ‘ocean’ is marked and ‘sun’ is unmarked:


(32) ulati    Ganga sáamudra-hi sosai,     sasihara sura.Ø grásai
     reversed Ganga ocean-ACC   dry.up.PRS moon     sun    swallow.PRS
     ‘The reversed Ganga dries up the ocean, the moon swallows the sun.’


The adpositional marking by means of kü/kau, infrequent and more recent, occurs in 
similar conditions, and most often without apparent reason. In (33), we may hypothesise 
that the relative pronoun is topicalized since the Hindi correlative system amounts to 
topicalizing the relative clauses (Gupta 1986; Montaut 2012) in the same way as condi- 
tionals (Haiman 1978), but in (34), the noun pada ‘word, line’, which is a marked object, 
is not the head of the relativized expression: 


(33) jakú    yahu jaga  ghini    kari  cālai
     REL.ACC DEM  world horrible do.CV go.PRS.3SG
     ‘That which this world avoids with disgust.’ (lit. ‘that which, considering [it]
     horrible, the world goes by’) (Kabir, verse 185.4)


(34) ya  pada  ků  bujhai            takú    tinyú tribhuvana sujhai
     DEM verse ACC understand.PRS.3SG 3SG.DAT three world      think.PRS.3SG
     ‘[Who] understands this pada, he knows the three worlds.’


In (34) the reason why the inanimate is marked is probably, apart from definiteness
(not in itself a triggering factor as shown by (29)), the intrinsic importance of the word
‘word/verse’ in the ideological context of the time: for a devotional mystic nothing is
more central and more emphasized than the deity’s speech, or the word pointing to the
deity. What is also noticeable is the parallel marking of the marked object (jaku, pada
ku) and the dative subject (taku) by the same postposition in (33) and (34).


4.2 Second stage: 16th century


In 16th century classical texts like Tulsidas' Ramayana (T), the inflectional marking (-hi)
is maintained, yet the postpositional marking occurs more often, in conditions similar to
those of stage 1: 1st and 2nd person pronouns are consistently in the oblique, as in (35)
and (36) and as in stage 1, and the same oblique form is also used for oblique subjects (36).
But unlike in the earlier period, human objects are systematically marked ((35) and (37)),
and only exceptionally unmarked, either as proper nouns or pronouns (38):


(35) tehi    na  jana     nrpa.Ø, nrpa-hi  so  jana
     3SG.OBL NEG know.PFV king    king-OBL 3SG know.PFV
     ‘The king did not recognize him, he recognized the king.’ (T 140)


(36) kaha tapas  nrpati.Ø janau   tohi  lag  bhala mohi
     said hermit king     know.SG 2.OBL seem good  1SG.OBL
     ‘Said the hermit: “I know you as the king, [this move] pleased me/I liked [it]”.’ (T 160)


(37) a. Raghupati-hi nihari  prabhü-hi citai.
        Sunlord-OBL  look.CV Lord-OBL  look.CV
        ‘[Sita] seeing Rama (king of the sun lineage); [Sita] looking at the Lord.’ (T 140)
     b. Siya-hi  biloki.
        Sita-OBL see.CV
        ‘[Ram] looking at Sita.’ (T 250)


(38) a. Ram biloke  log    [...] citai   Siya krpayatan  jāni vikal   bisesi.
        Ram see.PFV people       look.CV Sita gracefully knew worried special
        ‘Ram saw the folk, [...] looking at Sita with mercy he perceived her great
        distress.’ (T 251)
     b. rau    trsit   nahi so.Ø pahicana
        prince thirsty NEG  3SG  recognize.PFV
        ‘The king, overcome by thirst, did not recognize him.’ (T 158)


Whereas the unmarkedness of the collective log ‘people’ is still possible (cf. §1), the
zero marking of the proper noun Sita is no longer grammatical, even though it was quite
usual two centuries earlier. Indeed, all instances of X looks at/sees Y exhibit marking of
proper names in (37): whether Ram looks at Sita or Sita at Ram, whatever verb is used
(cita ‘look at/gaze’, nihar ‘see/look’, bilok ‘see/look’).

Another difference from the previous stage seems to be the more frequent marking of
nouns in small clauses (39), which however is still not systematic (40), even when the
small clause includes a participle (41):


(39) bhale-hi manda manda-hi bhale karahu
     good-OBL vile  vile-OBL good  do.PRS.2SG
     ‘You debase the good man (make vile the good), you praise the vile.’


(40) kol  biloki bhüpa bara dhira      bhagi paith giriguhá      gabhira
     boar see.CV king  much determined flee  enter mountain.cave deep
     ‘Seeing the king so much determined the boar entered a deep cave.’


(41) jo  prabhú tumah bipin  phirat  dekha
     REL Lord   2PL   forest roaming see.PFV
     ‘The Lord whom you saw roaming in the forest.’


Examples (37) to (41) are from Tulsidas' Ramayana, in an Eastern variety (Awadhi), but
in the Western dialects the situation was similarly unconstrained. Even proper nouns can
remain unmarked, as was the case in the first stage:


(42) mā     n      mhá liyà     Govinda mol
     sister INTERJ 1SG take.PFV Govinda buy.CV
     ‘Sister, I bought (and took) Govinda [a name for Krishna].’ (Mira Bai, 16th c.)


4.3 Third stage: 17th-18th centuries: the modern system


There is not much to comment on after the 17th century, since the system does not present
noticeable differences from the modern system. The literature available during this period
makes a more liberal use of Persian idioms and structures (particularly ezafe for
determination of nouns) than earlier Hindi and today's standard Hindi. Ezafe-specified
objects can be either marked (in (43a) ‘fire of torment’) or unmarked (in (43b) ‘heat’):
what prevails is the degree of topicality in the discourse:


(43) a. wafadari         ne  dilbar ki bujhaya             atis-e-gam      ko
        faithfulness.F.SG ERG lover  of extinguish.PFV.3M.SG fire-of-torment ACC
     b. ke      garmi dafa kartā hai      gulab ahista ahista
        that/as heat  off  make  PRS.3M.SG rose  slowly slowly
        ‘My faithful love has quenched the fire of my love (FOC), as rose dispels the
        effect of heat, step by step.’ (Wali, mid 17th c.)


(44) jab  só   dekha   nahi nazar-bhar  kakul-e-muskin-e-yar
     when from see.PFV NEG  glance-full locks-EZ-scented-EZ-beloved
     ‘Since I did not see fully her [my love's] scented locks (FOC).’ (Wali, mid 17th c.)


In the two parallel constructions (X diminishes Y) of (43), the first object, an abstract 
NP, is extracted and put in a postverbal position at the rhyme, in conformity with its
discourse function, since love torment is the main topos of the poems. It is marked. In 
turn, the second object, also an abstract noun, remains preverbal as an ordinary part 
of the wider focus and is unmarked. However, in (44), an ezafe-specified object similar 
to (43a), ‘scented locks of the beloved’, remains unmarked although concrete and in a 
postverbal position; even though it is strongly emphasized by its position, it is not given 
centre stage. Discourse saliency is the triggering factor, as it is today for inanimates. 

Objects are always marked when controlling nominal or adjectival adjuncts, either
relative pronouns with inanimate reference (whereas relative pronouns with a human
referent could be unmarked in the earlier period) or nouns, inanimate as well as animate:


(45) ke   jisko   kasine    kabhi và   na  dekha
     that REL.ACC INDEF.ERG ever  open NEG see.PFV.M.SG
     ‘That which (ACC) nobody has ever seen bloom.’ (‘which’ = mera dil ‘my heart’)


(46) kiya       mujh isq  ne  zalim  ko  ab    āhistā āhistā
     do.PFV.3SG my   love ERG despot ACC water slowly slowly
     ke      ātiś gul    ko  karti hai      gulāb āhistā āhistā
     that/as fire flower ACC do    PRS.3F.SG rose  slowly slowly
     ‘My love has melted the despot (made this despot water), step by step, as fire
     distils (makes the flower rose-perfume) the essence of rose, step by step.’
     (Wali, mid 17th c.)


The following tables provide an overview of the different referent types according to 
the animacy and definiteness scales (1), and the syntactic constraints (2). 


Table 1: Animacy and definiteness constraints on DOM

                                       Stage 1:  Stage 2:  Stage 3:       Modern
                                       14th c.   16th c.   17th-18th cc.  Hindi
Human      SAP pronouns                always    always    always         always
objects    Proper noun                 optional  frequent  always         always
           Third person pronoun        optional  frequent  always         always
           Other human nouns           optional  frequent  always         always
Inanimate  Specific nouns              optional  optional  optional       optional
nouns      Abstract (compact) nouns    optional  optional  frequent       frequent


Table 2: Syntactic constraints on object marking

                                   Stage 1:  Stage 2:  Stage 3:       Modern
                                   14th c.   16th c.   17th-18th cc.  Hindi
Human noun in small clause         optional  frequent  always         always
(human referents)
Inanimate noun in small clause     optional  optional  frequent       very frequent
with participle
Passive finite clauses             no data   no data   optional       optional
                                                       (human,        (human,
                                                       inanimate)     inanimate)

Table 1 does not take into account the cases of deranking. In Table 2 data is lacking for
the passive transformation in the earlier stages of the language, since the passive was
rare and always had a modal meaning of incapacity.

The only objects obligatorily marked in stages 1 and 2 are first and second person
pronouns, whereas neither personal names, nor titles and nouns referring to culturally
prominent persons are consistently marked. Objects controlling adjectival/nominal adjuncts
are still only optionally marked before stage 3 (17th c.). Inanimates are optionally marked
right from stage 1, as are animates other than first and second person pronouns. At no
stage was marking used as a distinguishing device, contrary to Comrie's (1979) or Croft's
(1988) hypothesis, and in accordance with the observations made by Malchukov (2008: 213)
that the discriminatory function is quite rare across languages, by Arkadiev (2009) that it
is not relevant for Indo-Iranian languages, and by de Hoop & Narasimhan (2005) that it is
absent in Hindi.

Regarding pronouns, the consistent marking of 1st and 2nd person pronouns from the very
beginning should not be over-emphasized, since they retained the accusative inflection
till the late Middle Indo-Aryan stage, as opposed to all other nominal categories
(including the 3rd person pronoun: unmarked in (38b) and marked in (35)). Table 3,
according to Bubenik (2006), gives the pronominal forms in the late Apabhramsha stage
(10th-11th c.): the 1st and 2nd persons retain an accusative, which is distinct both from
the nominative and from the dative/ablative, whereas the accusative is fused with the
nominative for the 3rd person:


Table 3: Pronominal forms in late Apabhramsha (Bubenik 2006)

Person  NOM                   ACC                          DAT
1st     hau/haŭ (Sk aham)     mai, mai, mai, me (Sk mam)   mujjh
2nd     tuhu/tuhú (Sk tvam)   pai, pai, tai (Sk tvam)      tujjh
3rd     NOM/ACC so (M.SG), sa (F.SG)


A considerable morphological restructuring of the system occurred between this stage 
and the first stages of New Indo-Aryan, with the genitive in -r- in most regional varieties, 
and various oblique forms depending on the region, which came to be used both for 
marked accusative and dative (36), before adpositions substituted for inflectional mor- 
phemes with the same bi-functional use as the old dative/accusative. Remarkably, mod- 
ern standard Hindi maintained the oblique form mujh and tujh before postpositions and 
the old inflectional forms mujhe and tujhe for dative/accusative, in alternation with the 
adpositional forms mujhko and tujhko, and it extended this system to the third person: 
the direct (vah) and oblique (us) cases are distinct, and in the dative/accusative there are
two alternate forms, one inflectional (use) and one adpositional (usko). Yet the fact that
the distinctive accusative was retained throughout Middle Indo-Aryan certainly played
an important role in the marking of 1st and 2nd person pronouns, in contrast with third
person pronouns and other nouns.


5 Object marking in the “dialects” of Hindi 


A good deal of ambiguity prevails in the field of language description since these
language varieties are considered, administratively and politically, as dialects of Hindi,
with various names and inner variation. Yet linguistically the variations in comparison
with modern standard Hindi are so important that many regard them as distinct languages:
nominal and verbal inflections are different, some languages (like Bhojpuri, Awadhi,
Maithili) lack grammatical gender and ergative alignment, others have three grammatical
genders or had them until recently (Western Rajasthani, close in this respect to
three-gender languages like Gujarati and Marathi). The modern stage of these “dialects”
itself displays great variation regarding object marking, some maintaining the old
situation as sketched above, some being closer to the system of standard Hindi. A
comprehensive representation of the whole picture, which involves 331 distinct varieties,
out of which there are at least a dozen distinct languages, is obviously outside the limits
of this study. I will therefore limit myself to the presentation of a few features that are
distinct from modern standard Hindi, and that may help explain the general trends in the
evolution of object marking.

Let us begin with a step ahead of the evolution of standard Hindi, if one takes agree- 
ment to be a reliable marker of the integration into the grammatical system. In Hindi 
the verb never agrees with a marked object, and indexation on the verb is only by de- 
fault. In Marwari, a Western dialect of Rajasthan, like in Gujarati, on the contrary, the 
marked object is indexed on the verb (gender number agreement) as shown by (47). This 
is also the case in Magahi, an Eastern dialect (Bihari) that shows agreement with marked 
objects, though somewhat differently, since all animate participants are indexed in the 
verb (48): 


(47) Marwari (Khokhlova 2001)
     mhai Saran       nai dekha
     1SG  Sharan.F.SG ACC see.F.SG
     ‘I have seen Sharan.’


(48) Magahi (Verma 1991)
     ham dekh-l-i     ham dekh-l-i-a          ham dekh-l-i-ain
     1SG see-PST-1    1SG see-PST-1-3nonH     1SG see-PST-1-3H
     ‘I saw it.’      ‘I saw him (servant).’  ‘I saw him (guru).’


One could also argue that indexing the marked object the same way as the unmarked 
object is not a step further if it is expected that marked objects should also be indexed 
in a marked way. Yet no example in the various stages of object marking in Indo-Aryan 
displays an agreement with marked objects prior to agreement with unmarked objects 
(the ergative pattern precedes the emergence of DOM by far), whereas all other examples 
point to the agreement blocking effect of DOM. 

Another factor observed in certain regional varieties which is at variance with standard
Hindi and its historical emergence is the correlation between object and subject marking:
in 19th century Kumaoni, for instance, no marked object occurs with an ergative agent, as
in (49), even when controlling an adjunct, as in (50), whereas with a nominative subject,
objects are marked when they are human or specific centre-staged inanimates, as in (51):


(49) Kumaoni (Grierson 1903-1928: IX-4)
     myara dagariyana.le      ek  báàman  pakaro
     my    companion.M.PL.ERG one Brahmin seize.PFV.M.SG
     ‘My companions captured a Brahmin.’


(50) Kumaoni (Grierson 1903-1928: IX-4)
     prithviim lag yo   pahar    hamari thātī         raci     dev.le
     earth.LOC too this mountain our    place.to.live make.PFV god.ERG
     ‘God made this mountain a place to live for us on earth too.’


(51) Kumaoni (Grierson 1903-1928: IX-4)
     tab  ù   wi  bwaj kani apan ghar huni li   āy
     then 3SG DEM load ACC  REFL home ALL  take come.PFV
     ‘Then he brought this load to his house.’


According to Stronski (2013) and Sharma (1987), object marking has now started to extend
to ergative sentences in some varieties, as in (52a): a modern development, still unknown
in standard Kumaoni, which contrasts in this respect with Garhwali, very closely related
to Kumaoni and part of the same sub-group of Pahari languages, which allows the object
to be marked in the presence of an ergative agent, as in (52b):


(52) a. Kumaoni (Krzysztof Stronski p.c.)
        mile    naunai sani bait le  mare
        1SG.ERG child  ACC  cane INS strike.PFV
        ‘I hit the boy with a cane.’
     b. Garhwali (personal field work)
        mina    naunai lai / sani (nauno.Ø) bait na  mari
        1SG.ERG child  ACC / ACC  (child)   cane INS strike.PFV
        ‘I hit the boy with a cane.’


In Garhwali, the marked object is allowed in a sentence displaying an ergative agent
right from the first attested texts collected by Grierson (1903-1928) (53), whether the
object is inanimate or human, but it is not compulsory even today, except for proper names
(54). Its optionality is not constrained by the presence vs. absence of an ergative agent.
In folk songs, which are linguistically archaic, it is optional, and prosodic considerations,
for instance, may possibly apply, as in (55), where a married girl is not allowed to visit
her family. The object nouns consisting of one long syllable and one short one, ‘mother’
and ‘brother’, are marked, while nouns with two long syllables, ‘father’, ‘sister-in-law’
and ‘sister’, remain unmarked.


(53) Garhwali (Grierson 1903-1928: IX-4)
     ve-na   sattu sani ve  talau mà dal   dini/dine
     3SG-ERG sattu ACC  DEM lake  in throw gave
     ‘He threw the sattu (a sort of cereal) in the lake.’


(54) Ta  Anil Rawat tai/sani/ku jandi cha?
     2SG Anil Rawat ACC/ACC/ACC know  PRS
     ‘(Do) you know Anil Rawat?’


(55) a. chorie mu  jan na  deule
        girl   1SG go  NEG give.FUT.1SG
        ‘Girl, I won't let you go.’ (GCT 124)
     b. tero bon    yakhi bulaulo,  mu  jan na  deule
        your father here  call.FUT  1SG go  NEG give.FUT.1SG
        ‘I will invite your father, I won't let you go.’
     c. teri ami    ku  yakhi bulaulo,  mu  jan na  deule
        your mother ACC here  call.FUT  1SG go  NEG give.FUT.1SG
     d. tere bhai    ku  yakhi bulaulo,  mu  jan na  deule
        your brother ACC here  call.FUT  1SG go  NEG give.FUT.1SG
     e. teri bhabhi        yakhi bulaulü,  mu  jan na  deule
        your sister.in.law here  call.FUT  1SG go  NEG give.FUT.1SG
     f. teri didi   yakhi bulaula,  mu  jan na  deule
        your sister here  call.FUT  1SG go  NEG give.FUT.1SG
        ‘I will invite your mother, I won't let you go. I will invite your brother, I
        won't let you go. I will invite your sister-in-law, I won't let you go. I will
        invite your elder sister, I won't let you go.’


Similarly in Bhojpuri, a language not as closely related to Hindi as Garhwali is,
popular songs display unmarked human objects such as ‘my child’ in (56a), whereas
in modern speech a similar object ‘my son’ is obligatorily marked with the
dative/accusative marker ke in (56b):


(56) Bhojpuri (Saxena 1937 [1970])
     a. apnà bàlaka     mohi    dite,         apnà balaka     nahi debo
        REFL male.child 1SG.DAT give.COND.1SG REFL male.child NEG  give.FUT.1SG
        ‘If you give me your son, I will not give [you] my son.’
     b. tu  apna laika ke  bheja
        2SG REFL boy   ACC send.IMP
        ‘Send your son.’


Obligatoriness in flagging human objects, giving priority to human referents over
inanimates, is clearly a recent phenomenon, in Bhojpuri as well as in Hindi, and it is
limited to certain dialects. Discourse-related triggers are active everywhere, and
affectedness does not play a noticeable role (cf. (1) and (34)). As for agreement, it is
exceptionally present in the Hindi belt and has been attributed to contact in the case of
Magahi (see example (48)): Verma (1991) suggests that this peculiarity of the language,
which also presents numerous cases of double agreement, results from contact with Mundari,
an Austro-Asiatic tribal language spoken in central-eastern India. With few exceptions,
DOM can co-occur with differential agent marking when the subject is an ergative agent,
which is a clear indication that the discriminatory function is only weakly relevant. It
never co-occurs with an experiencer subject in the dative case, nor did it in stage 1 (34),
because DOM is strictly restricted to formally transitive clauses, while experiential
clauses, even with two arguments, are not transitive sensu stricto.

The fact that the accusative marker is morphologically identical to the dative marker, 
whatever the form of the marker in the various dialects, also accounts for this situation. 
In Dravidian languages, where the accusative is distinct from the dative, in Tamil for 
instance, -ai/e (ACC) vs. -akku (DAT), such a constraint does not hold:


(57) Tamil (own data)
     enakku  avarai     pidikkum / teriyum
     1SG.DAT 3M.SG.ACC  like       know
     ‘I like / know him.’


(58) Modern Standard Hindi (own data)
     mujhe   vah / *usko    acchà lagta hai     / mila
     1SG.DAT 3SG   3SG.ACC  good  seem  PRS.3SG   meet.PFV
     ‘I like / met him.’


6 Some hypotheses regarding the origin of the marking and the markers


6.1 Contact with substratum, adstratum and prestige language 


As already mentioned, DOM is one of the dozen features that are systematically
considered to define South Asia as a linguistic area, along with dative subjects,
prevalence of complex predicates, coverbs, causative derivation, lack of a ‘have’ verb,
head-final order, reduplication, etc. (Masica 1976; Emeneau 1980). Its appearance in
Indo-Aryan is more or less contemporary with the rise of dative subjects: it has not been
inherited from Sanskrit, an inflectional language where the accusative is a structural case
(all objects are case-marked, a purely syntactic phenomenon). In contrast, the agglutinative
Dravidian languages had, right from the first attested texts (slightly before the Christian
era), DOM marking for human objects (suffix -ai), while they developed the Dative
Subject pattern much later with a distinct suffix (Murugaiyan 2004), only slightly before
Indo-Aryan languages. Given the importance of structural borrowings from Dravidian
in IA, such as the use of coverbs and the quotative, and the evidence of a Dravidian
substratum in the area now occupied by Indo-Aryan speakers (Witzel 1995), Dravidian could
be a plausible source for the IA marking. The behavior of the “accusative” suffix -ai in
modern Tamil, however, is not constrained by transitivity since it can occur with dative
subjects, as in (59), unlike in Hindi, as in (60):


(59) Tamil (own data)
     enakku  avar.ai    pidikkum / teriyum
     1SG.DAT 3M.SG.ACC  like       know
     ‘I like / know him.’


(60) Modern Standard Hindi (own data)
     mujhe   vah / *usko    acchà lagta hai     / mila
     1SG.DAT 3SG   3SG.ACC  good  seem  PRS.3SG   meet.PFV
     ‘I like / met him.’


Moreover, the wide time gap observed before the borrowing makes the hypothesis of
a structural borrowing dubious. Similarly, Austro-Asiatic languages, which also played
a non-trivial role in the evolution of early Indo-Aryan (Witzel 1995), have always been
around, so that a sudden borrowing in the second millennium is hardly convincing. They
consistently index human objects as well as beneficiaries on the predicate, but do not
index inanimates, whatever their syntactic function, since indexing is constrained by
semantics, particularly animacy and activity, and by the general grammatical structure,
as in semantically aligned languages. Moreover, they do not have differentially marked
objects:


(61) a. (in)  lel-jad-in-a-e
        (1SG) see-PST-1SG-V-3SG
        ‘He saw me.’ (V marks the predicative function, in a language with no
        noun-verb polarity)
     b. (in)  om-am-tan-a-in
        (1SG) give-2SG-PRS-V-1SG
        ‘I give (it/them) to you.’


Such features can only very indirectly be deemed responsible for new features in IA,
whether DSM (Montaut 2013) or DOM, yet they may have acted as a favoring factor.

The other possible source in terms of contact is Persian, which came to be the dominant
cultural and administrative language at the time when DOM became systematic in
Hindi (16th c. onwards). Extremely influential in the renewal of the predicate lexicon by
means of complex predicates (Montaut 2015), Persian, which extensively uses a marker
(rā) (originally a topic marker) for specific objects, is also sometimes credited with
having triggered DOM in Hindi/Urdu. Krishnamurti et al. (1986: 143) observe that DOM is
more developed in the North Western IA languages than in the central and Eastern ones,
and conclude that there was probably an influence of Persian and more generally of
central Asian languages.

Note that Bengali also allows the accusative marker, even if it is the same as the dative
marker, in experiential sentences such as Tamil (57), because experiential subjects are in
the genitive in Bengali.

While none of these hypotheses fully explains the rise of DOM in Indo-Aryan (as expected,
in keeping with its interpretation as a mainly discourse-related factor), the latter,
allowing for a possible convergence with other substrata in the sub-continent, must
definitely be taken into account. The origin of the new case markers, in contrast, has
nothing to do with contact.


6.2 The origin of the case markers 


Since the function is anterior to the morphological renewal of markers, as seen in
examples (29b), (31) and (32) with inflectional forms in -hi (§3.1), one can expect that
some other case marker, already present in the language, extends its range of functions to
the marking of certain objects, and that the dative is chosen for such an extension, as
with, for instance, the Spanish preposition a. But the new Hindi marker appeared at the
same time as the other case markers, continuing the oblique inflection of the earlier
language, which was largely syncretic and not restricted to goals. It is obvious, however,
that in all IA languages, although they display several distinct forms of markers for the
accusative, the same marker is now used for the dative (including DSM) and the marked
accusative (Krishnamurti et al. 1986): the case meaning specialization (its syntactic
function) came later than the marking of DOM itself, and the double use of a single marker
as a dative and an accusative has a logic per se, which is found in too many languages in
the world to be specific to the area.

Now the question remains: why are there so many morphologically unrelated markers
for dative/accusative case in languages which are so closely related, in contrast with
Dravidian languages, which all exhibit related forms? Marathi, for instance, has la;
Gujarati has ne; Konkani, until recently considered a dialect of Marathi, has -k;
Hindi/Urdu has ko; Punjabi, which is structurally extremely close to Hindi/Urdu and
established a distinct identity after the 16th c., has nú; Hindi “dialects” such as the
central Pahari languages (Garhwali, Kumaoni) have sani; Eastern Pahari, such as Nepali,
has lai.

Eastern IA has other devices for marking specificity, such as the so-called “article” or
“classifier” -ta, which does not co-occur with the accusative marker, as shown by Dasgupta
(2015 (manuscript)). Besides, all Dardic languages, spoken in the North West of the South
Asian area, have always shared features with Iranian languages, before the Mughal Empire,
which marked the entrance of Persian as a cultural language in Central India.

The bases used most extensively are la (le, lai, lai), ko (kau, kū, ku) and ne (nai, nē,
nu), and none of them, except là to a certain degree, derives from a clearly allative
notion. The base for là and its reflexes, for instance, has generally been derived since
Beames (1970 [1875]) from the verbal root lag, meaning ‘touch’, ‘be stuck to’ (although
some scholars have suggested the verb labh ‘to get, obtain’ as an alternative derivation
(Tiwari 1955)). The regular path is as follows: lagya ‘having come in touch with’ > lage >
lai, lai (le) ‘for the sake of’, ‘with the object of’ (Juyal 1976). As for ko and its
reflexes, it comes from the Sanskrit noun kaksa ‘side, place’, with intermediate forms
closer to the original in certain Pahari varieties (kakh, kakh, kakhá), initially a
locative, which further developed a directional meaning, then became a dative/accusative
marker (Strnad 2013: 325). Similarly, ne and its reflexes were initially locatives derived
from a Sanskrit noun meaning ‘ear’: a shortened form of kanhai according to Tessitori
(1914-1916), from *karnasmin (itself a reconstructed analogical locative of Sanskrit karne,
the locative case of the noun ‘ear’), which is attested in Apabhramsha as kannahi and
developed the meaning ‘aside, near’, then ‘towards, to’. Trumpp (1872: 401) also gives the
original meaning ‘near’ for nai/ne, a derivation accepted by Tiwari (1955; 1966) and by
Chatak (1980), who relates to it the alternate form kuni, frequent in Garhwali (central
Pahari). The originally locative meaning is very clear in (62):


(62) Old Rajasthani (Tessitori 1914-1916: 68-70)
     a. cardi nai nirmala nira
        road  LOC pure    water
        ‘A limpid lake close by the road’
     b. avya      rà   kanhai
        come.M.PL king LOC/ALL
        ‘[They] went to the Raja (king).’
     c. te  savihü  nai     karaü      paranam
        3PL all.OBL LOC/ALL do.PRS.1SG salutation
        ‘I bow to all of them (in front of/for).’


The adessive/locative meaning still visible in (62a) is also the original meaning of the
‘side’ base (at the origin of ko), and the main meaning of the ‘touch/be in contact’ base
(origin of lai). As a matter of fact, the word ‘ear’ is, according to Heine & Kuteva (2002:
121), a very infrequent source for the dative, and is mentioned only as a source for the
locative.

Other sources for DOM markers are even farther from a goal source, or they are
semantically totally empty: not common but not rare either (it is present in Sinitic
languages, cf. Chappell 2014) is the comitative source, which is found in markers such as
Garhwali/Kumaoni sani (hani), from the Sanskrit noun sanga ‘society, company’, then
‘with’, now the most usual dative/accusative marker. Other unusual markers, also used
in Pahari languages, are tai, tai, derived from the locative of the indefinite tavati (tavahi, 
tamhi*taai, "ronnt, tai) ‘so long, so far, up to, till’, thai from the existential verb stha 
‘stand’, ‘exist’ and te/ti, from the present participle of the verb “be” in the locative (Sk 
bhavati > hontai, hunti). One marker has not yet been convincingly traced to a reliable 
origin, bai, be, dative/accusative marker in modern Kullui (Western Himalaya) as well 
as in Bundeli (South Madhya Pradesh). 

This be is perhaps related to the Garhwali/Kumaoni bati, used in these languages as an
ablative, and derived from the verbal noun vartamana (from Sanskrit vrt ‘turn, expand’,
then ‘what happens’, ‘present’). Ablative and goal obviously encode opposite meanings,
but similar “opposite” uses of case markers are extremely common across IA: te/ti is also
used as an ablative in other northern dialects, ne is a frequent marker for the ergative
(Hindi/Urdu, Panjabi, Marathi) and le (a reflex of lai) is the Nepali and Kumaoni ergative
marker.

Even more striking is the fact that, in the very same language, the same marker may
work as an ergative, an instrumental/ablative, and a dative/accusative, as is the case in
Bangaru in both southern (63a) and northern (63b)-(63c) varieties:


(63) Bangaru
     a. rupay ti  us-ti   le   lo
        money ACC 3SG-ABL take take.IMP
        ‘Take the money from him.’ (Tiwari 1955: 177)
     b. kutte nae dande nae marya
        dog   ACC stick INS strike.IMP
        ‘Strike the dog with the stick.’ (Singh 1970)
     c. balka      nae toriya honge
        child.M.PL ERG break  PRSUMPT.3M.PL
        ‘The children have probably broken [it].’ (Singh 1970)


All the IA case markers are derived from words with such a vague semantic content
that they are able to fulfill all case functions, with the exception of the new locative
mé/ma, even if in most languages they are now more or less specialized into broad
functions. New functions (DOM, EXP) as well as inherited ones (ERG, DAT, INS) selected
any of the available markers when case marking shifted from the old inflections, by then
much eroded and syncretic, to the new postpositional system during the first part of the
second millennium. But, interestingly, none developed a specific marker on the Dravidian
or Persian model, and none selected a DOM marker distinct from the DAT one.


7 Conclusions 


As a result of the identical case-marking for dative and accusative, experiential subjects 
and marked objects are similarly encoded, and the rise of DOM and DSM is chronolog- 
ically very comparable: starting with only sporadic non-consistent occurrences during 
the 14'? c. and getting systematic and consistent after the 17% c. Is it an argument for 
making both processes complementary as suggested by Aissen (2003)? This is highly 
controversial since experiential subjects are strictly constrained by the lexical seman- 
tics of the predicate (and to a certain degree by its morphology since it occurs almost 
exclusively in Hindi with complex predicates), whereas marked objects obey discourse 
constraints. Specificity can be considered the more important triggering factor for DOM, 
yet in order to account for those alternations which at first glance seem to be syntactically
constrained (§2.3 and §3) another factor is required, namely discourse saliency. This
is not incompatible with Dalrymple & Nikolaeva's (2011) notion of secondary topicality, 
nor with the prominence involved in the twin scales of animacy and specificity, yet it
also allows us to account for examples where unmarked objects are in a topicalized po- 
sition and vice-versa. Not surprisingly, the first constraints which emerged during the 
diachronic evolution of the structure are neither animacy nor specificity but discourse 
prominence, of which prosodic requirements can be considered an auxiliary. Besides, 
the existence of a threefold distinction between objects (‘incorporated’, unmarked and 
accusative-marked) in nominalizations, depending on their individuation, has no
equivalent for subjects.


Abbreviations 

1      first person          INS      instrumental
2      second person         INTR     intransitive
3      third person          M        masculine
ACC    accusative            NEG      negation, negative
ALL    allative              NON      non-
COND   conditional           NOM      nominative
CV     coverb                OBL      oblique
DAT    dative                PASS     passive
DEF    definite              PFV      perfective
DEM    demonstrative         PL       plural
DET    determiner            POSS     possessive
ERG    ergative              PPRF     pluperfect
EZ     ezafe                 PRF      perfect
F      feminine              PROG     progressive
FOC    focus                 PRS      present
FUT    future                PRSUMPT  presumptive
GEN    genitive              PST      past
H      human                 REFL     reflexive
HON    honorific             REL      relative
IMP    imperative            SG       singular
INDEF  indefinite            TOP      topic
INF    infinitive            V        predicative function
The diachronic development of
Differential Object Marking in Spanish
ditransitive constructions


Klaus von Heusinger 


Universität zu Köln


Differential Object Marking (DOM) in Spanish synchronically depends on the referential 
features of the direct object, such as animacy and referentiality, and on the semantics of 
the verb. Recent corpus studies suggest that the diachronic development proceeds along 
the same features, which are ranked in scales, namely the Animacy Scale, the Referentiality 
Scale and the Affectedness Scale. The present paper investigates this development in ditran- 
sitive constructions from the 17th to the 20th century. Ditransitive constructions in Spanish 
are of particular interest since the literature assumes that the differential object marker a is 
often blocked by the co-occurrence of the case marker a for the indirect object. The paper 
focuses on the conditions that enhance or weaken this blocking effect. It investigates three 
types of constructions with a ditransitive verb: (i) constructions with indirect objects real- 
ized as a-marked full noun phrases, (ii) constructions with indirect objects as clitic pronouns, 
and (iii) constructions with non-overt indirect objects. The results clearly show that DOM is 
more frequent with (iii) and less frequent with (i). Thus the results support the observation
that the co-occurrence of an a-marked indirect object partly blocks a-marking of the
direct object. Furthermore, the results show for the first time that indirect
objects realized as clitic pronouns without the marker a have a weaker blocking effect, but 
still a stronger one than constructions without overt indirect objects. In summary, the paper 
presents new and original evidence of the competition between arguments from a diachronic
perspective.


1 Introduction 


Differential Object Marking (DOM) in Spanish is realized by the marker a, which is de- 
rived from the preposition a ‘to’ and which is also used to mark the indirect object. DOM 
in Spanish depends on referentiality, animacy and affectedness (see Pensado 1995; Brugé 
& Brugger 1996; Leonetti 2004; von Heusinger & Kaiser 2007). The a-marking of the direct 
object can easily co-occur with the prepositional a, but in ditransitive constructions with 
a-marked indirect objects, the a-marking of the direct object can or must be dropped. In
this paper I focus on the development of DOM in Spanish ditransitive constructions. 
While the development of DOM in transitive constructions is well-investigated (see 
Melis 1995; Laca 2002; 2006; von Heusinger 2008), there are very few studies that inves- 
tigate competition of the marker a between the direct object and the indirect object (but 
see Company Company 1998; 2002; Ortiz Ciscomani 2005; 2011; Rodríguez-Mondoñedo 
2007). I will provide a qualitative corpus search, complementing the investigation of
Ortiz Ciscomani and providing new material to discuss the relation between the
development of a-marking in transitive sentences and that in ditransitive sentences. I
take the results to support the view that DOM in ditransitive constructions has
developed similarly to DOM in transitive constructions, but that both a pronominal indirect
object and a full noun phrase indirect object reduce the frequency of DOM for direct objects.

In contemporary Spanish, a human definite direct object in a transitive construction 
must be marked by the differential object marker a as illustrated in (1). The a-marked 
definite direct object can co-occur with a prepositional object marked by a, as in (2), 
but is generally blocked or disfavored by the occurrence of an a-marked indirect object 
realized by an a-marked full noun phrase in a ditransitive construction, as in (3). The 
co-occurrence of an a-marked direct object and an a-marked indirect object is subject 
to controversial grammaticality judgments, cf. (4) - judgments according to Company 
Company (2001: 20). 


(1) Busco al / *el médico.
seek.1SG DOM.the / the doctor
‘I am seeking the doctor.’

(2) Envié a mi hermana a Caracas.
sent.1SG DOM my sister to Caracas
‘I sent my sister to Caracas.’

(3) El maestro presentó Ø su mujer a los alumnos.
the teacher introduced.3SG his wife to the students
‘The teacher introduced his wife to the students.’

(4) ??/*El maestro presentó a su mujer a los alumnos.
the teacher introduced.3SG DOM his wife to the students
‘The teacher introduced his wife to the students.’


There is a controversy about the effect of clitic doubling of the indirect object. Under
certain grammatical conditions, indirect objects can or must be doubled by a clitic
(pronoun) form that agrees in case and number with the indirect object (Campos 1999; 
Gabriel & Rinke 2010). There are at least three positions on the effect of clitic doubling in 
ditransitive constructions: it facilitates a-marking of the direct object, it favors blocking 
of a-marking, or it makes a-marking ungrammatical. (i) Company Company (1998; 2002) 
claims that the clitic le in (5) facilitates the a-marking of the direct object. (ii)
Rodríguez-Mondoñedo (2007: 216) claims that “[...] clitic-doubled IOs seem to allow the dropping
[of the a marker] more easily than their non-doubled counterparts, at least for some
speakers [...].” (iii) Fábregas (2013: 31) reports that a-marking of the direct object is more
acceptable without a clitic than with one, as in (6). Ormazabal & Romero (2013: 224)
also assume that clitic doubling bans a-marking of the direct object.


(5) El maestro le presentó a su mujer a Juan.
the teacher DAT.3SG introduced.3SG DOM his wife to Juan
‘The teacher introduced his wife to Juan.’ (judgement according to
Company Company 2001: 20)

(6) *Le enviaron a todos los heridos a la doctora.
DAT.3SG sent.3PL DOM all the injured to the doctor
‘They sent all the injured to the doctor.’ (judgement according to Fábregas 2013: 31)


The diachronic development of DOM in Spanish is fairly well documented and has been
investigated primarily in transitive constructions (see Melis 1995; Melis & Flores 2009; Laca 2002;
2006; von Heusinger & Kaiser 2007; von Heusinger 2008). Diachronic data of ditransitive 
constructions with two full noun phrases are rare and therefore difficult to collect, but 
the examples below provide some interesting observations. Already in ditransitive con- 
structions in the 13th century, an alternation between a-marked direct objects (7) and 
unmarked direct objects (8) can be seen. 


(7) E dio Ercules a Manilop a la reyna Anthipa, su
and gave.3SG Hercules to Manilop DOM the queen Anthipa his
hermana.
sister
‘And Hercules gave his sister, Queen Anthipa, to Manilop.’ (GEII (General Estoria,
segunda parte), 21, 13th century, quoted after Ortiz Ciscomani 2011: 167)

(8) El dio Ø sus fijas a aquellos dos infantes ante
he gave.3SG his daughters to those two infants in.front.of
todos sus ricos omnes.
all his rich men
‘He gave his daughters to those two princes in front of all his rich men.’ (GE
(General Estoria), 344, 13th century, quoted after Ortiz Ciscomani 2011: 168)


One also finds this alternation in sentences with clitic-doubled indirect objects: the
direct object Leonor is a-marked in (9), while the direct object media mujer (‘half woman’)
is unmarked in (10) (examples from the 17th century):


(9) A Mendo, hijo de hermana menor, le quiero dar
to Mendo son of sister younger DAT.3SG want.1SG give.INF
a Leonor.
DOM Leonor
‘To Mendo, son of (my) younger sister, I want to give Leonor.’
(Moreto, Agustín (1618-1669), El lindo Don Diego)

(10) Aun si les dieran Ø media mujer a cada uno,
even if DAT.3PL gave.3PL half woman to each one.MASC
fuera menor el daño.
would.be.3SG less the damage
‘Even if they gave half a woman to each one (of them), the damage would be less.’
(Castro, Guillén de (1569-1631), El conde de Irlos)


We can summarize the observations regarding DOM in transitive and ditransitive
constructions as follows. DOM in transitive constructions in Spanish is well investigated:
synchronically, specific indefinite human direct objects are obligatorily marked, non-specific ones
are optionally marked, and non-human direct objects are almost never marked (Brugé &
Brugger 1996; Leonetti 2004; von Heusinger & Kaiser 2007; García García 2014). DOM is 
blocked or less often used in ditransitive constructions with the indirect object realized 
by a full noun phrase with the dative case marker a. There is variation in diachronic data, 
but so far the relevant parameters for this variation, if any, cannot be identified. 

There are various theories of DOM with different emphasis on syntactic, semantic or 
functional properties of the a-marker. For the sake of the argument (and broadly simpli- 
fying), I assume four positions, which do not necessarily exclude each other: (a) DOM 
as a case marker, (b) DOM in competition with indirect object case marking, (c) DOM 
indicates the syntactic status of a noun phrase as an argument, (d) DOM as a means to 
disambiguate between subject and object. (a) It is often assumed that DOM is the case 
marker of the direct object, which is shown by the dependency on certain syntactic con- 
structions, such as small clauses (Brugé & Brugger 1996; Rodríguez-Mondoñedo 2007; 
Ormazabal & Romero 2013). Such a syntactic perspective predicts a certain stability of
the phenomenon and makes clear predictions that follow from general case principles (only one case
assignment in a clause). (b) Company Company (1998; 2002) argues that direct objects
are marked by DOM, while there are different means in addition to the a-marking to 
mark an indirect object (such as clitic doubling) - see also Delbecque (1998; 2002) for a 
construction grammar approach. In the history of Spanish, there has been continuous 
competition between these two strategies. DOM is a strategy for marking direct objects, 
and becomes unavailable when it creates an ambiguity with indirect objects. If, how- 
ever, there are no other means available, a-marking is reserved for the indirect object 
and cannot be simultaneously used for the direct object. This picture provides an account 
of some of the diachronic data, but does not always seem to be confirmed by synchronic 
data (see Melis & Flores 2009 for discussion). (c) Synchronically, it is assumed that DOM 
signals that the direct object is a proper argument that saturates the verbal frame, while 
unmarked direct objects are more like bare nouns that modify the verb (Chung & Ladu- 
saw 2004; López 2012). This view predicts a certain stability in similar semantic contexts. 
It is, however, not clear how this view can account for the diachronic data, in particular 
the observation that in earlier stages of Spanish, DOM was only obligatory for pronouns 
and proper names, but not for definite noun phrases. Still, definite noun phrases are 
arguments in Chung & Ladusaw's (2004) account and should be a-marked according to 
López (2012). (d) Functional theories assume that one of the main functions of DOM is to 
identify a direct object, if it is too similar to the subject, i.e. if it has too many properties 
of prototypical subjects. Besides this main function, DOM can additionally express other
semantic or pragmatic features, such as topicality, referentiality or specificity (Comrie 
1975; Bossong 1985; Aissen 2003) or telicity (Torrego Salcedo 1999) or affectedness (von 
Heusinger & Kaiser 2011). DOM is often overextended and conventionalized (grammat- 
icalized), i.e. used in contexts where a distinction between subject and object is already 
given by other means (e.g. verbal agreement; see Aissen 2003 for discussion). The func- 
tional view seems to be flexible enough to model diachronic change, and it predicts a 
certain variability in the actual realization of DOM. In this paper I cannot answer the
question of which of the four positions is the most appropriate one. Rather, I provide
additional observations that might support one or the other account.

The main focus of this paper is to compare the development of DOM in transitive 
constructions with the development in ditransitive constructions. I have restricted the 
data to direct objects realized by human noun phrases, i.e. definite NPs and indefinite 
NPs. For transitive constructions, I will use the material presented in the literature (Melis 
1995; Laca 2006; von Heusinger & Kaiser 2007; 2011; von Heusinger 2008) and compare 
this with the data of Ortiz Ciscomani (2005; 2011). I have also created my own corpus, 
including three realizations of ditransitive constructions, all of which have a direct object
realized as a human definite or indefinite noun phrase (subjects, by contrast, are not always
overt or realized as full noun phrases): In type (i), the indirect object is not realized, either
because it is inferred from the context or because it is left unspecified.
Type (ii) realizes the indirect object as a clitic pronoun, generally before the finite verb.
Type (iii) realizes the indirect object as a full noun phrase that is obligatorily marked by a
(see Table 1).


Table 1: Types of constructions and argument realizations

Type   Example                                      IO
(i)    El maestro presentó (a) su hijo              not realized
(ii)   El maestro le presentó (a) su hijo           clitic pronoun
(iii)  El maestro presentó (a) su hijo al alumno    full NP

‘The teacher introduces his son (to him, to the student).’


I put forward the following hypotheses, which will be tested using data extracted from 
diachronic corpora: 
• H1: The type of the ditransitive construction determines the blocking effect:

  i. constructions with indirect objects realized as a-marked full noun phrases
     (definite NPs, indefinite NPs) show a high blocking effect,

  ii. constructions with indirect objects as clitic pronouns show a low blocking
      effect, and

  iii. constructions with non-overt indirect objects do not show any blocking
       effect.

• H2: DOM in ditransitive constructions has a comparable development to DOM in
  transitive constructions.

• H3: Verb classes differ with respect to the way they influence DOM and
  DOM-blocking.


In §2 I summarize the synchronic and diachronic conditions for DOM in Spanish. §3
presents the synchronic restrictions on DOM in ditransitive constructions. §4
summarizes earlier research on ditransitives in Spanish (Company Company and Ortiz
Ciscomani), introduces the corpus created for this paper, and discusses the results of the
corpus search. §5 evaluates the results with respect to the three hypotheses
and provides a general discussion of DOM in ditransitive constructions.


2 DOM in transitive constructions 


2.1 Synchrony of nominal and verbal parameters related to DOM 


I will limit the investigation to European Spanish throughout this paper, but see Com- 
pany Company (2002) for Mexican Spanish. It is commonly assumed that there are at 
least four main factors for DOM in the languages of the world: (i) animacy properties 
of the direct object; and (ii) referential properties, such as indexicality (deixis), definite- 
ness and specificity, of the direct object. The referentiality status is clearly indicated by 
the morphological form of the noun phrase and ordered on the Referentiality Scale (see 
below (14)). (iii) Information structure might determine DOM; in particular, topical
direct objects tend to be marked. (iv) Finally, transitivity properties of the
verb also influence DOM (see Comrie 1975; Bossong 1985; Aissen 2003; de Swart 2007;
Iemmolo 2010; Iemmolo & Klumpp 2014; Witzlack-Makarevich & Seržant 2018). DOM or
a-marking in Spanish is determined by all four main parameters:

(i) Only human direct objects can be marked, while non-human (animate) and
inanimate direct objects are obligatorily unmarked. However, there is a small class of verbs,
such as verbs of substitution, that allow DOM for inanimate direct objects (see García 
García 2014, 2018 [this volume] for an extensive discussion), cf. (13). In the remainder, I 
will exclude inanimate direct objects as I am not aware of ditransitive constructions that 
allow DOM for inanimates. 


(11) Conozco *(a) este actor.
know.1SG DOM this actor
‘I know this actor.’

(12) Conozco (*a) esta película.
know.1SG this film
‘I know this film.’

(13) En esta receta la leche puede sustituir *el/al huevo.
in this recipe the milk can substitute the/DOM.the egg
‘In this recipe the milk can substitute the egg.’


(ii) Specific indefinite human direct objects and all direct objects that are higher on 
the Referentiality Scale (14) must be a-marked, cf. (15). Even non-specific indefinites can 
optionally be a-marked, cf. (16), where the subjunctive sepa (‘might know’) of the relative 
clause indicates that the head noun un ayudante (‘an assistant’) is non-specific.
Determinerless noun phrases (‘bare nouns’ in their ‘non-argumental’ function), such as camarero
(‘waiter’) in (17), must not be a-marked.


(14) Referentiality Scale:
personal pronoun > proper noun > definite NP > specific indefinite NP
> non-specific indefinite NP > non-argumental


(15) Vi *(a) la/una mujer.
saw.1SG DOM the/a woman
‘I saw the / a woman.’

(16) Necesitan (a) un ayudante que sepa inglés.
need.3PL DOM an assistant that know.3SG English
‘They need an assistant who knows English.’

(17) Necesitan (*a) camarero.
they.need waiter
‘They need a waiter.’


(iii) Topicality is also often said to be a parameter of DOM in Spanish. As in many
other DOM languages, leftwards-moved direct objects are obligatorily a-marked, cf. (18);
see Leonetti (2004: 86). It is, however, much harder to argue that non-moved a-marked
noun phrases are topical. Iemmolo (2010) argues that such noun phrases show certain
properties of topics and links DOM to topichood, while Dalrymple & Nikolaeva (2011)
assume that DOM indicates a secondary topic, as a direct object is rarely the primary
sentence topic.¹


(18) *(A) muchos estudiantes, ya los conocía.
*(DOM) many students, already them knew.1SG
‘Many students I already knew.’


(iv) Verbal categories are also decisive for DOM in Spanish. Bello (1847: 567-570) and
Fernández Ramírez (1951: 151-190) present rich material on the variation according to
different verb types in Spanish. Pottier (1968: 87) proposes the scale in (19) for a-marking
in Spanish, which is slightly modified by von Heusinger & Kaiser (2007: 94) to the
Scale of Affectedness and Expected Animacy, cf. Table 2 (see also von Heusinger & Kaiser
2011 for a different affectedness categorization, based on Tsunoda 1985).


¹ See also Chiriacescu (2014) and Guntsetseg (2016) for the function of DOM as a secondary topic in
Romanian and Mongolian, respectively.


(19) Verbal Scale (Pottier 1968: 87: “un axe sémantique verbal”)
matar ‘kill’ > ver ‘see’ > considerar ‘consider’ > tener ‘have’


Table 2: Scale of Affectedness and Expected Animacy (von Heusinger & Kaiser
2007: 94)

Class 1 [+ human]            >  Class 2 [+ human]         >  Class 3 [(+)/- animate]
matar ‘kill’, herir ‘hurt’      ver ‘see’, hallar ‘find’      tomar ‘take’, poner ‘put’


The scale in Table 2 predicts that verbs like matar (‘to kill’), which clearly prefer a 
human direct object, are much more likely to mark the direct object than verbs that do 
not show such a preference, such as ver (‘to see’). Verbs that prefer an inanimate direct
object show synchronically the lowest rate of a-marking of their human direct objects. 


2.2 Diachrony of NP-related properties 


Like Modern Spanish, Old Spanish exhibits DOM. However, as shown in several di- 
achronic studies (Melis 1995, Laca 2006), DOM in Old Spanish is less frequent than in 
Modern Spanish and used under different conditions. Human definite direct objects are 
optionally a-marked, as the two examples in (20)-(21) illustrate. Non-human animate 
indefinite direct objects are generally not a-marked, as in (22). 


(20) Old Spanish (Cid, 2637)
Reciba a mios yernos commo elle pudier mejor.
receive.IMP.2SG DOM my sons.in.law as he could.3SG better
‘Have him welcome my sons-in-law as best he can.’

(21) Old Spanish (Cid, 2956)
Ca yo case sus fijas con yfantes de Carrion.
for I married.1SG his daughters with Infantes of Carrion
‘for I married his daughters to the Infantes of Carrion’

(22) Old Spanish (Cid, 480-481)
Tanto traen las grandes ganancias, muchos gañados de
very brought.3PL the big wealths many herds of
ovejas e de vacas.
sheep and of cows
‘They brought such great wealth, many herds of sheep and cows.’
Table 3 summarizes the findings of Laca (2006), which are based on the manual
collection of utterances in her corpus of documents from the 12th to the 19th century. Proper
names are a-marked from the time of Old Spanish, while definite and indefinite NPs 
show a clear development. Non-human direct objects are rarely marked. 


Table 3: Diachronic development of a-marking in Spanish according to the
Referentiality Scale (selection from Table 3 of Laca 2006: 442). I replaced the
original abbreviations in the following way: NPrHum: human proper name,
HumDef-Pro: human definite NP, HumInd-Pro: human indefinite NP, HumØ:
human bare noun

century        12th      14th      15th       16th       17th       18th      19th
proper name    96% (26)  100% (8)  100% (35)  95% (44)   100% (65)  79% (29)  89% (27)
definite NP    36% (36)  55% (66)  58% (65)   70% (122)  86% (136)  85% (53)  96% (76)
indefinite NP  0% (6)    6% (31)   0% (1)     12% (59)   39% (53)   62% (32)  41% (29)
bare noun      0% (12)   0% (7)    16% (12)   5% (40)    2% (39)    9% (22)   6% (17)


Figure 1 presents Laca’s data in a graphic that illustrates that the rate of a-marking 
has increased over time and along the Referentiality Scale. 
[Line chart; legend: proper noun, definite NP, indefinite NP, bare noun]


Figure 1: Diachronic development of a-marking in Spanish according to the 
Referentiality Scale (based on Laca 2006: 442, Table 5; from von Heusinger & 
Kaiser 2011, Fig. 3) 
2.3 Diachrony and affectedness


Von Heusinger & Kaiser (2007) apply the Scale of Affectedness, cf. Table 2, to a small 
corpus from the Bible to show the diachronic development along this scale. The corpus 
consists of the two Books of Samuel and the two Books of Kings in three Bible
translations, abbreviated as A-C. Translation A is from the 14th century and was only available
as a printed version. The other two translations were electronically available at Biblegate:
B, the Reina Valera Antigua from the 16th/17th century, and C, its contemporary version from 1995
(Reina Valera). Example (23) illustrates the development and its interaction with
topicalization. The verb tomar (‘take’) is from Class 3, i.e. from those verbs that strongly prefer
an inanimate direct object. In the translation from the 14th century, the direct object a
vuestras fijas (‘your daughters’) is a-marked, since it is left-moved, while the direct object
in the translation from the 16th century is not moved and unmarked. However, the trans- 
lation from the 20th century provides DOM for the direct object in the base position, as 
expected for definite human noun phrases. 


(23) 1 Samuel 8, 13

A (14th)  E a vuestras fijas tomará por especieras e cosineras e panaderas.

B (16th)  Tomará también Ø vuestras hijas para que sean perfumadoras,
          cocineras, y amasadoras.

C (20th)  Tomará también a vuestras hijas para perfumistas, cocineras y
          amasadoras.

English   ‘He will take (A: DOM, B: Ø, C: DOM) your daughters to be
          perfumers, cooks and bakers.’


In a detailed analysis, von Heusinger & Kaiser (2007) searched the small corpus for 
all instances of definite and indefinite noun phrases that filled the direct object of the 
following six verbs categorized in three classes: Class 1: matar ‘kill’, herir ‘hurt’, Class 
2: ver ‘see’, hallar ‘find’, and Class 3: tomar ‘take’, poner ‘put’, cf. Table 2. These classes
differ not so much in affectedness of the direct object, but rather in the expectedness 
of animacy of the direct object. Class 1 has a very high expectation that the object is 
human, while class 2 is rather neutral, and class 3 has an expectation of an inanimate 
direct object. Table 4 provides the figures for human definite direct objects and Table 5 
for human indefinite direct objects.²

Figure 2 summarizes the two tables and clearly shows that referentiality is the main 
parameter for DOM: Definite direct objects are more often a-marked than indefinite
direct objects. Furthermore, the verb class is a crucial parameter for DOM. The effects of the
two parameters add up (there is no interaction between them).

Von Heusinger (2008) extends the corpus search to more fine-grained historical periods,
using Mark Davies’ Corpus del Español. The corpus comprises 100 million words of Spanish
texts from the 12th to the 19th century. The corpus interface allows one to search for lem- 
mas, rather than for word forms (as in simple text files of the Bible texts). However, such 


² An alternative view is that not the animacy, but the agentivity of the direct object is the relevant parameter
for DOM (see García García 2014 for this view). 
Table 4: Percentage of a-marking of human definite direct objects. (Bible
translations of 1+2 Samuel and 1+2 Kings, from von Heusinger & Kaiser 2011: 606)

class                          14th cent.    16th/17th cent.  20th cent.
1. matar ‘kill’, herir ‘hurt’  60% (24/40)   66% (37/56)      92% (36/39)
2. ver ‘see’, hallar ‘find’    38% (9/24)    48% (13/27)      81% (26/32)
3. tomar ‘take’, poner ‘put’   30% (7/23)    30% (7/23)       67% (20/30)


Table 5: Percentage of a-marking of human indefinite direct objects. (Bible
translations of 1+2 Samuel and 1+2 Kings, from von Heusinger & Kaiser 2011: 607)

class                          14th cent.    16th/17th cent.  20th cent.
1. matar ‘kill’, herir ‘hurt’  7% (1/14)     7% (1/14)        91% (10/11)
2. ver ‘see’, hallar ‘find’    0% (0/11)     15% (2/13)       45% (5/11)
3. tomar ‘take’, poner ‘put’   0% (0/15)     0% (0/28)        17% (2/12)


[Line chart: percentage of a-marking (0%-100%) for A: 14th cent., B: 16th/17th cent. and
C: 20th cent.; series: Class 1 matar/herir, Class 2 ver/hallar and Class 3 poner/tomar,
each with definite and indefinite direct objects]


Figure 2: Percentage of a-marking depending on verb class, definiteness and 
time; Class 1: matar ‘kill’, herir ‘hurt’, Class 2: ver ‘see’, hallar ‘find’, and Class 3: 
tomar ‘take’, poner ‘put’ (Three Bible translations of 1+2 Samuel and 1+2 Kings, 
from von Heusinger & Kaiser 2011: 607) 
searches are still very time-consuming since one has to select the definite or indefinite
human direct objects by hand. In the case of tomar only about 1-7% of all hits were hu- 
man definite or indefinite full NPs. The others were either inanimate, or human and of 
a different type on the Referentiality Scale, such as clitics, personal pronouns, proper 
names or different types of quantifiers. The study originally differentiates between eight 
time periods from the 12th to the 19th century, which have been reduced to four time 
periods. Furthermore, the search was restricted to two verb classes, and one verb for 
each class: matar ‘to kill’ for class 1 and tomar ‘to take’ for class 3 (see von Heusinger
2008 for the details, and von Heusinger & Kaiser 2011 for a compact presentation).
Table 6 shows that in the 12th and 13th century, 50% of human definite direct objects of
matar are marked with a. This number continually increases and reaches about 90%
by the 18th and 19th century. The marking of the definite direct object of tomar is
less preferred: only 40% are marked in the 12th and 13th century, a number that
continuously increases to about 80% in the 18th and 19th century. Table 7 provides the
numbers for human indefinite direct objects. As expected, a-marking is less preferred, 
but there is a clear increase over time and some difference between the two verb classes. 


Table 6: Percentage of a-marking of human definite direct objects. (Corpus del
Español, from von Heusinger & Kaiser 2011: 608)

class            12th + 13th cent.  14th + 15th cent.  16th + 17th cent.  18th + 19th cent.
1. matar ‘kill’  50% (25/50)        63% (27/43)        78% (32/41)        91% (39/43)
3. tomar ‘take’  40% (38/95)        55% (30/55)        70% (7/10)         83% (20/24)


Table 7: Percentage of a-marking of human indefinite direct objects (Corpus
del Español; from von Heusinger & Kaiser 2011: 608)

class            12th + 13th cent.  14th + 15th cent.  16th + 17th cent.  18th + 19th cent.
1. matar ‘kill’  5% (2/42)          8% (3/40)          15% (6/40)         37% (16/43)
3. tomar ‘take’  3% (1/34)          4% (2/47)          11% (1/9)          23% (7/31)


Figure 3 compares the development of a-marking for definite and indefinite human 
direct objects for the two verbs. It shows three points: (i) a-marking in Spanish increases 
over time; (ii) it depends on the Referentiality Scale as human indefinite direct objects 
show less preference for DOM than definite ones; (iii) there is a tendency for a-marking 
to depend on the verb class, i.e. on the preference of the verb for the animacy of the direct 
object. Note that only human direct objects were counted, which means that there are 
two independent parameters: first the actual animacy of the direct object and second the 
preference of the verb for the animacy of the direct object. 
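
The three points can be checked directly against the raw counts reported in Tables 6 and 7. The following is only a minimal sketch in Python: the counts are copied from the two tables, while the function names (`rates`, `increases_over_time`, `dominates`, `check_points`) are hypothetical helpers introduced here for illustration and are not part of the original study. The comparisons are purely descriptive; several cells are small (e.g. 7/10 or 1/9), so the sketch says nothing about statistical significance.

```python
from fractions import Fraction

# Counts (a-marked, total) per period, copied from Tables 6 and 7.
PERIODS = ["12th+13th", "14th+15th", "16th+17th", "18th+19th"]
COUNTS = {
    ("matar", "definite"): [(25, 50), (27, 43), (32, 41), (39, 43)],
    ("tomar", "definite"): [(38, 95), (30, 55), (7, 10), (20, 24)],
    ("matar", "indefinite"): [(2, 42), (3, 40), (6, 40), (16, 43)],
    ("tomar", "indefinite"): [(1, 34), (2, 47), (1, 9), (7, 31)],
}


def rates(verb, definiteness):
    """Return the exact a-marking rate for each period as a Fraction."""
    return [Fraction(marked, total) for marked, total in COUNTS[(verb, definiteness)]]


def increases_over_time(series):
    """True if every period has a strictly higher rate than the one before."""
    return all(earlier < later for earlier, later in zip(series, series[1:]))


def dominates(higher, lower):
    """True if `higher` exceeds `lower` in every period."""
    return all(h > l for h, l in zip(higher, lower))


def check_points():
    """Evaluate points (i)-(iii) on the counts above."""
    # (i) a-marking increases over time in all four series.
    point_i = all(increases_over_time(rates(v, d)) for v, d in COUNTS)
    # (ii) definite objects are marked more often than indefinite ones.
    point_ii = all(
        dominates(rates(v, "definite"), rates(v, "indefinite"))
        for v in ("matar", "tomar")
    )
    # (iii) matar (class 1) shows higher rates than tomar (class 3).
    point_iii = all(
        dominates(rates("matar", d), rates("tomar", d))
        for d in ("definite", "indefinite")
    )
    return point_i, point_ii, point_iii


if __name__ == "__main__":
    for (verb, definiteness), cells in COUNTS.items():
        row = ", ".join(
            f"{period}: {100 * marked / total:.0f}%"
            for period, (marked, total) in zip(PERIODS, cells)
        )
        print(f"{verb} + {definiteness} DO -> {row}")
    print(check_points())
```

Exact fractions are used for the comparisons so that rounding to whole percentages, as in the tables, cannot affect the outcome.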
[Line chart: percentage of a-marking for 12th+13th, 14th+15th, 16th+17th and 18th+19th cent.;
series: Class 1 matar and Class 3 tomar, each with definite and indefinite direct objects]


Figure 3: Percentage of a-marking depending on verb class, definiteness and 
time; Class 1: matar ‘kill’ and Class 3: tomar ‘take’ (Corpus del Español; from 
von Heusinger & Kaiser 2011: 606) 


3 Blocking of DOM in ditransitive constructions 


As mentioned above, DOM in Spanish is realized by the marker a. The same marker is used
for the indirect object and is also the main directional preposition; it derives from
Latin ad ‘to’, as its prepositional use still clearly shows. The marker a is used
differentially for the direct object and obligatorily for the indirect object. I assume
that a-marked direct objects are not datives but accusatives, as shown by the following
criteria: a) passivization, cf. (24); b) replacement by the pronoun lo for masculine and
la for feminine, cf. (25); and c) doubling of a leftward-moved direct object by the
clitic pronoun lo or la, cf. (26) (Campos 1999).


(24) a. Ema y Tito observaron a Ana.
        Ema and Tito observe.PST.3PL DOM Ana
        ‘Ema and Tito observed Ana.’

     b. Ana fue observada por Ema y Tito.
        Ana was observed by Ema and Tito
        ‘Ana was observed by Ema and Tito.’


(25) a. A: ¿Viste a Kiko?
        see.PST.2SG DOM Kiko
        ‘Did you see Kiko?’

     b. B: Sí lo vi.
        yes ACC.3SG see.PST.1SG
        ‘Yes, I saw him.’


(26) A Claudito lo vi por primera vez en diciembre.
     DOM Claudito ACC.3SG see.PST.1SG for first time in December
     ‘Claudito, I saw him for the first time in December.’


The indirect object in the dative is identified by the impossibility of forming a passive,
cf. (27)-(28), and by its replacement by the clitic pronoun le in the singular and les in
the plural, cf. (29).3


(27) Juan (le) dio una limosna a nuestro vecino ayer.
     Juan (DAT.3SG) give.PST.3SG a charity DAT our neighbor yesterday
     ‘Juan gave our neighbor a charity yesterday.’


(28) *Nuestro vecino fue dado una limosna.
     intended reading: ‘Our neighbor was given a charity.’


(29) Juan regaló un libro a Maria, y Pablo le regaló flores.
     Juan present.PST.3SG a book to Maria and Pablo DAT.3SG present.PST.3SG flowers
     ‘Juan presented a book to Maria and Pablo presented her flowers.’


3The following assumes non-leísta varieties of Spanish. Spanish grammars describe as ‘leísta’ those
varieties in which the form le stands for the direct object (instead of lo, la), as in (i) (under certain
conditions, depending on the leísta type). The verb conocer ‘to know’ takes a direct object, in the first
sentence the a-marked direct object a Juan. In the second sentence, non-leísta varieties would use the
accusative pronoun lo, while leísta varieties use le (for the accusative).


(i) ¿Conoces a Juan? Sí, le conozco hace tiempo.
    know.2SG DOM Juan yes ACC.3SG know.1SG since time
    ‘Do you know Juan? Yes, I have known him for some time.’


In general, the question of leísta varieties should not interfere with the question of a-marking of the
direct object, since only definite indirect objects are clitic doubled, but not direct objects (in most varieties
of Spanish). Thus the clitic pronoun le in (ii) can only double the indirect object al alumno (‘the student’),
but not the direct object su hijo (‘his son’), which optionally takes DOM; see Fernández-Ordóñez (1999).


(ii) El maestro le presentó (a) su hijo al alumno.
     the teacher DAT.3SG present.PST.3SG (DOM) his son to.the student
     ‘The teacher presented his son to the student.’
Finally, a prepositional phrase introduced by a, as in (30), can never be replaced by the
clitic pronoun le. Rather, it must be picked up by a locative expression.


(30) María viaja a París y Ana *le / allá viaja, también.
     María travel.3SG to Paris and Ana *DAT.3SG / there travel.3SG too
     ‘María travels to Paris and Ana travels there, too.’


To summarize, the form a is used for marking the direct object (then glossed as DOM),
for marking the indirect object (glossed as ‘to’ or ‘DAT’) and as the regular
preposition ‘to’. The tests above clearly distinguish these functions.


3.1 DOM and clitic doubling of indirect objects 


According to Campos (1999: 1548), there are two classes of indirect objects, goals and
benefactives: goals occur with predicates of movement or transfer, while benefactives
cover indirect objects that are included in the event described by predicates of creation,
destruction, ingestion or preparation. For goal datives, clitic doubling is optional, cf. (31);
for benefactives, clitic doubling is obligatory, cf. (32).


(31) Lola (le) dio el juguete a Pablo.
     Lola (DAT.3SG) give.PST.3SG the toy to Pablo
     ‘Lola gave the toy to Pablo.’ (CInd)

(32) Lola *(le) rompió el juguete a Pablo.
     Lola *(DAT.3SG) break.PST.3SG the toy DAT Pablo
     ‘Lola broke Pablo's toy.’ (CInd)


Campos (1999: 1554) also quotes the grammar of the Real Academia Española (RAE
1973: §3.4.6), which states that DOM may be dropped in order to disambiguate.


(33) Presentaron Ø la hija a los invitados.
     introduce.PST.3PL the daughter to the guests
     ‘They introduced the daughter to the guests.’


According to Campos, the simultaneous use of the marker a for the DO and the IO becomes
ungrammatical when a dative clitic doubles the indirect object (34) (Campos 1999:
1554, fn. 79):


(34) *Les presentaron a la hija a los invitados.
     DAT.3PL introduce.PST.3PL DOM the daughter to the guests
     ‘They introduced the daughter to the guests.’ (Campos 1999: 1554, fn. 79)


There is an extensive literature on clitic doubling in Spanish (and more generally in
Romance languages), including studies on its diachronic development. I cannot do justice
to all of them, but see Fontana (1993); Fischer & Rinke (2003); Gabriel & Rinke (2010);
von Heusinger (2017).
3.2 Causative constructions


López (2012: 24) observes that in causative constructions the human indefinite causee of
an intransitive verb, as in (35), is accusative and a-marked according to its referentiality
status as specific. It is accusative since it cannot be doubled by a dative clitic, and if
it were inanimate it would not be marked by a. If the complement of a causative predicate
is a transitive verb, the causee is obligatorily a-marked, but this time it is dative, as can
be observed from the clitic doubling in (36), where the plural clitic agrees with a unas
empleadas. DOM is now optional for the direct object of the embedded verb.


(35) María hizo trabajar los domingos a/*Ø un empleado.
     María made work the Sundays DOM an employee
     ‘María made an employee work on Sundays.’


(36) María les hizo visitar a/Ø un enfermo a/*Ø unas empleadas.
     María PL.DAT made visit DOM a sick DAT some employees
     ‘María made some employees visit a sick person.’


López also observes that the same facts hold for perception verbs. The direct object of
a perception verb is obligatorily a-marked if human and at least specific, as in (37). While
the subject of the embedded clause is an indirect object and thus obligatorily a-marked,
the direct object of the embedded clause in (38) is optionally a-marked.


(37) María vio caer a/*Ø un niño.
     María saw fall DOM a child
     ‘María saw a child falling.’


(38) María vio a/*Ø una empleada visitar a/Ø un enfermo.
     María saw DAT an employee visit DOM a sick
     ‘María saw an employee visiting a sick person.’


Thus, the alternation or blocking of DOM by a second a-marked NP is found not only
in ditransitive constructions with direct and indirect objects, but also in causative
constructions and in constructions with perception verbs.


3.3 Semantic and pragmatic effects 


A-marking of indefinite direct objects can signal wide-scope readings, while the lack of
a-marking often signals narrow-scope readings (I leave it open whether the following
examples are instances of scope or of a referential vs. non-referential reading of the
indefinite). López (2012: 77) argues that in (39) the unmarked direct object un niño ‘a child’
cannot take scope over the operator expressed by la mayoría ‘most’, while the a-marked a un
niño can. This contrast is also found in ditransitive constructions, as in (40): the a-marked
version a un niño expresses wide scope (a pragmatically not very prominent reading).
(39) Ayer vieron la mayoría de los hombres a/Ø un niño.
     yesterday saw the most of the men DOM a child
     ‘Yesterday most of the men saw a child.’
     ∃ > MOST only with DOM

(40) Ayer entregaron a/Ø un niño a la mayoría de las madres.
     yesterday delivered DOM a child DAT the majority of the mothers
     ‘Yesterday they delivered a child to most of the mothers.’
     ∃ > MOST only with DOM


Leonetti (2004: 102) argues that the a-marked a un prisionero in (41) is a more prominent
binder than the unmarked un prisionero, and can therefore bind the possessive su in the
indirect object. In the version with unmarked un prisionero, the possessive su is most
probably bound by another antecedent.


(41) Devolvieron a/Ø un prisionero a su tribu.
     they.returned DOM a prisoner to his tribe
     ‘They returned a prisoner to his tribe.’


3.4 Summary of the observations for DOM in ditransitive constructions


DOM in ditransitive constructions is restricted by the co-occurrence of the indirect
object marker a. The very short review above provides the following picture: in most
constructions that require DOM in transitive contexts, DOM in ditransitive or causative
contexts can be blocked by an indirect object realized as a full descriptive noun phrase
with the marker a. The characteristics of this blocking are still not well investigated.


4 A diachronic account of DOM in ditransitive 
constructions 


In this section, I present the results of an intensive corpus search on three types of
constructions of ditransitive verbs: (i) constructions with indirect objects realized as
a-marked full noun phrases (definite NPs and indefinite NPs), (ii) constructions with
indirect objects as clitic pronouns, and (iii) constructions with non-overt indirect objects.
In §4.1 I give a short summary of a similar study by Ortiz Ciscomani (2005; 2011), in §4.2 I
provide information on how I collected the material and composed the corpus, and in §4.3
I present and discuss the results with respect to the three hypotheses formulated in §1.
4.1 Earlier studies in ditransitive constructions


Ortiz Ciscomani (2005; 2011) has analyzed a diachronic corpus of Spanish with respect
to ditransitive constructions from the 13th to the 20th century. In her corpus, Ortiz
Ciscomani (2011: 20) identified 3,061 ditransitive constructions, of which 2,269 occur with
finite and 792 with nonfinite verbs. For ditransitive constructions with full noun phrases,
she restricts her analysis to the finite contexts. In her study (Ortiz Ciscomani 2005), she
investigates the 13th, 14th, 16th, 19th and 20th century, with 1,661 ditransitive
constructions, of which 141 have full human noun phrases for both the direct object and
the indirect object:4,5


Table 8: Percentage of human direct objects with DOM and without DOM with
respect to all instances of ditransitive constructions (Ortiz Ciscomani 2011: 162;
Ortiz Ciscomani 2005: 198)


century   % DO with DOM     % DO without DOM   % total
13th      2.2% (7/316)      8.2% (26/316)      10.4% (33/316)
14th      5.2% (6/115)      30.4% (35/115)     35.7% (41/115)
16th      1.1% (6/567)      8.3% (47/567)      9.3% (53/567)
19th      0.8% (3/381)      1.6% (6/381)       2.4% (9/381)
20th      1.4% (4/282)      0.4% (1/282)       1.8% (5/282)
total     1.6% (26/1661)    7% (115/1661)      8.5% (141/1661)


Ortiz Ciscomani (2005) observes (i) that the percentage of this construction (with two
full human noun phrases) with respect to all constructions decreases from 10% and 36%
in the 13th and 14th century to about 2% in the 19th and 20th century, and (ii) that the
contrast between DOM and the lack of DOM persists through time. She does not calculate
the percentages of DOM vs. non-DOM constructions for full noun phrases (both direct
object and indirect object), but see Comrie (2013: 47) and Table 9 for a different
presentation of the same material, which allows one to compare DOM and non-DOM for
each century. It becomes obvious that DOM increases through time, even though the 19th
and 20th centuries provide very few data. Table 9 compares the figures for ditransitive
constructions with the figures of Laca (2006) for transitive constructions (see Table 3
above). One can assume that the stark contrast between definite and indefinite direct
objects with respect to DOM observed for transitive constructions also holds for
ditransitive constructions.


4Ortiz Ciscomani (2011: 162) notes that languages resist a construction with full noun phrases for a human
direct and a human indirect object. Only 8.5% of all investigated cases show this configuration. See also
von Heusinger & Kaiser (2011), who report similarly low percentages of full noun phrases for human
direct objects in transitive constructions.

5Note that Ortiz Ciscomani uses two different tables. In her dissertation (Ortiz Ciscomani 2011) she presents
the table as in Table 10 with all centuries from the 13th to the 20th, while in her article (Ortiz Ciscomani 2005)
she only selects the 13th, 14th, 16th, 19th and 20th, hence the different numbers.
Table 9: Percentages of DOM based on number of ditransitive constructions
with human direct objects and human indirect objects (reanalysis of Table 6 of
Ortiz Ciscomani 2005: 198) - compared to the data of transitive constructions
(see Laca 2006: 442 and Table 3 above)


% of DOM with ditr. verbs for      % of DOM with tr. verbs
definite and indefinite NPs        (Laca 2006)
(Ortiz Ciscomani 2005)

cent.   % DOM              cent.   definite NPs     indefinite NPs
13th    21% (7/33)         12th    36% (13/36)      0% (0/6)
14th    15% (6/41)         14th    55% (36/66)      6% (2/31)
16th    11% (6/53)         16th    70% (85/122)     12% (7/59)
19th    33% (3/9)          19th    96% (73/76)      41% (12/29)
20th    80% (4/5)          20th    -                -
total   18% (26/141)       total   69% (207/300)    17% (21/125)
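

The ditransitive column of Table 9 re-bases the counts of Table 8: instead of dividing by all ditransitive constructions of a century, it divides by the constructions with two full human noun phrases. The following minimal Python sketch, which is an illustration and not part of the original study, recomputes these shares from the Table 8 counts; the variable and function names are my own.

```python
# Counts from Table 8 (Ortiz Ciscomani 2005), per century:
# (human DO with DOM, human DO without DOM)
counts = {
    "13th": (7, 26),
    "14th": (6, 35),
    "16th": (6, 47),
    "19th": (3, 6),
    "20th": (4, 1),
}


def dom_share(with_dom, without_dom):
    """Percentage of a-marked DOs among constructions with two human full NPs."""
    return 100 * with_dom / (with_dom + without_dom)


# Rounded share per century, as in the ditransitive column of Table 9
shares = {century: round(dom_share(*pair)) for century, pair in counts.items()}

total_with = sum(w for w, _ in counts.values())
total_without = sum(wo for _, wo in counts.values())
total_share = round(dom_share(total_with, total_without))

for century, share in shares.items():
    print(f"{century}: {share}%")
print(f"total: {total_share}% ({total_with}/{total_with + total_without})")
```

The small denominators in the last two rows (9 and 5 tokens) are what make the late percentages so volatile.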


Ortiz Ciscomani (2011: 166) also observes that only certain ditransitive verbs are con- 
structed with DOM, as Table 10 shows. 


Table 10: Verbs with DOM in ditransitive constructions with human direct
objects and human indirect objects (Ortiz Ciscomani 2011: 166; my own
translation, KvH)


century
Verb 13th 14th 15th 16th 17th 18th 19th 20th Total
dar ‘to give’ 2 1 1 1 5
enviar ‘to send’ 2 4 1 16
encomendar ‘to entrust’ 1 1 1 5
toller ‘to take away’ 1 1
echar ‘to throw’ 1 1 2
llevar ‘to carry’ 1 1
entregar ‘to submit’ 1 1
mandar ‘to order, to send’ 1 1
mostrar ‘to show’ 1 1
presentar ‘to present’ 1 1
total 7 6 5 6 1 3 4 (34/2269) 1.5%


To summarize, Ortiz Ciscomani (2011) provides the first quantitative approach to the
diachrony of ditransitive constructions. She has analyzed more than 3,000 sentences
with ditransitive constructions, of which fewer than 10% have a human full NP as indirect
object and a human full NP as direct object. Fewer than 20% of these instances show
a-marking of both arguments, and the data suggest a development towards this
kind of marking (and less blocking). However, data are very scarce and therefore
quantitative conclusions cannot be drawn from her analysis. She has also identified certain
verb classes that allow DOM in this construction. While this study is very instructive, it
needs to be complemented by studies of larger corpora.


4.2 Data collection 
4.2.1 Method 


In order to complement the corpus study of Ortiz Ciscomani (2005; 2011), I carried out an
extensive corpus search focused on particular verbs. I used Mark Davies’ Corpus del
Español, which comprises 100 million words of Spanish texts from the 12th to the 20th
century. The corpus interface allows one to search for lemmas rather than for word
forms. In a first step I identified the verbs to be analyzed. I started from Ortiz Ciscomani's
(2005; 2011) list of verbs and modified it according to assumed verb properties and their
behavior in contemporary Spanish. I identified two verb classes with two verbs each: A:
verbs of caused perception (presentar ‘to present’, recomendar ‘to recommend’); and B:
verbs of caused motion (enviar ‘to send’, poner ‘to put’).

In the Corpus del Español, I searched for the lemma presentar in four different centuries
(17th, 18th, 19th and 20th); for recomendar I collected data from the 18th and 20th
century, and for enviar and poner from the 17th and 20th century. When a search resulted
in more than 1,000 hits per century, the analysis was restricted to the first 1,000 hits.
The hits were then filtered for cases with human full noun phrases as direct objects
(definite NPs and indefinite NPs), since only those cases qualify for DOM. I distinguished
three types of constructions: (i) the indirect object is realized as a human full noun
phrase, (ii) the indirect object is realized by a clitic pronoun, and (iii) the indirect object
is not overtly realized, i.e. the construction looks like a transitive construction. For
example, the search for the lemma presentar resulted in 1,031 hits from the 17th century.
I analyzed the first 1,000 hits; there were 47 instances with a human full noun phrase as
direct object. Out of these 47 cases, there were 8 (2+6) with a human full noun phrase as
indirect object, 18 (2+16) instances of the indirect object realized as a clitic pronoun,
18 (9+9) instances of no overt indirect object, and 3 cases that I could either not analyze
or not assign to one of the three categories. For the first three categories I distinguished
between DOM and the lack of it, as summarized in Table 11.


Table 11: Sample analysis for presentar ‘to present’ for the 17th century in the
Corpus del Español


                   full human IO    IO as clitic only   no overt IO             hits
            cent.  DOM   no DOM     DOM   no DOM        DOM   no DOM    else    analyzed   searched   all
presentar   17th   2     6          2     16            9     9         3       47         1000       1031
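

The annotation scheme behind Table 11 crosses the realization of the indirect object with the presence or absence of DOM. A minimal Python sketch of such a tally is given below; the category labels (`full_human_io`, `clitic_io`, `no_overt_io`) are hypothetical names chosen for illustration, and only the counts come from Table 11.

```python
from collections import Counter

# Hypothetical labels for the three construction types (i)-(iii)
IO_TYPES = ("full_human_io", "clitic_io", "no_overt_io")

# Counts for presentar in the 17th century as reported in Table 11;
# keys are (IO type, direct object is a-marked)
tally = Counter({
    ("full_human_io", True): 2,
    ("full_human_io", False): 6,
    ("clitic_io", True): 2,
    ("clitic_io", False): 16,
    ("no_overt_io", True): 9,
    ("no_overt_io", False): 9,
})
uncategorized = 3  # the "else" column of Table 11

# Row sums per construction type and the number of analyzed human full-NP DOs
per_type = {t: tally[(t, True)] + tally[(t, False)] for t in IO_TYPES}
analyzed = sum(per_type.values()) + uncategorized

print(per_type)
print("analyzed:", analyzed)
```

The per-type sums correspond to the figures 8, 18 and 18 given in the text, and together with the uncategorized cases they add up to the 47 analyzed instances.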
About 13,000 entries in total were analyzed out of which about 600 had a human direct
object, i.e. a direct object that can be optionally a-marked. Some verbs and constructions
had to be eliminated such that eventually 322, i.e. about 2.5% of the analyzed hits, could
be used for the final analysis, as presented in Table 12.


Table 12: Overview of the distribution of hits to verb classes and DOM vs. no 
DOM instances in the Corpus del Español (17th to 20th century). 


Verb class DOM no DOM Sum
A (caused perception) 64 61 125 
presentar ‘to present’ 54 50 104 
recomendar ‘to recommend’ 10 11 21 
B (caused motion) 92 105 197 
enviar 'to send' 73 90 163 
poner ‘to put’ 19 15 34 
total 156 166 322 


4.2.2 Analyzing particular examples 


Before I discuss the overall results, I will present some particular examples in detail. This 
will provide more information about the structure of the examples, but also show that 
in each particular case, additional parameters might have contributed to the a-marking 
of the direct object, its blocking or its lack of a-marking (one cannot always clearly 
distinguish between a blocking effect and a case in which a-marking is not licensed due 
to other parameters). In order to facilitate the reading of the examples, I annotated the 
subject (Sub), the direct object (DO), the indirect object (IO) and highlighted the verb, the 
direct object and the indirect object. In some cases I mark long noun phrases by brackets 
for the ease of parsing. In (42) the direct object el celebrado don Diego de Covarrubias 
y Leiva ‘the celebrated don Diego de Covarrubias y Leiva’ is a-marked besides the a- 
marked indirect object a nuestro obispado ‘our bishopric’. In (43) the direct object los 
enfermos is not a-marked, even though the construction and word order are very similar. 
There are clear differences between the two direct objects: the direct object in (42) is a 


Four other verbs had to be excluded from further analysis: the search for the lemmata acusar (‘to accuse’)
and denunciar (‘to denounce’) resulted only in transitive constructions, not in ditransitive ones.
I also excluded the verb encomendar (‘to entrust’, ‘to (re)commend’), as its use seems to be conventionalized
with the indirect object either a Dios (‘God’) or a la Madre del cielo (‘the mother of heaven’). The
great majority of these examples have a-marking for the direct object. I speculate that the meaning is
conventionalized and understood as an opaque idiomatic expression. I also excluded the 16 instances of
dar ‘to give’, since they were difficult to categorize and often close to idiomatic or light verb constructions,
as well as all bare nouns and proper names, since their referentiality status obligatorily determines no DOM
or DOM, respectively, see §4.3.1 and Table 13 below.
proper name, is singular and has much more descriptive content, all parameters known
to contribute to DOM.


(42) (ia): DOM and full indirect object
Promovido a Valencia don Martín Pérez Ayala, presentó el rey_Sub a nuestro
obispado_IO [al celebrado don Diego de Covarrubias y Leiva]_DO, que al
presente era obispo de Ciudad Rodrigo. (Colmenares, Diego de. (1586-1651),
Historia de la insigne ciudad de Segovia y compendio de las historias de Castilla)
‘After the promotion of don Martín Pérez Ayala to Valencia, the king introduced
to our bishopric the celebrated don Diego de Covarrubias y Leiva who
currently was the bishop of the city (of) Rodrigo.’


(43) (ib): no DOM and full indirect object
Los Médicos_Sub son los_Sub que presentan al Rey_IO los enfermos_DO. (Feijoo,
Benito Jerónimo (1676-1764), Cartas eruditas y curiosas, vol. 1)
‘The doctors are the ones who present the sick to the king.’


In (44) the indirect object is realized as the enclitic pronoun os, and the direct object
al señor conde del Verde Saúco is a-marked. In (45), however, the direct object profetas y
doctores is unmarked. Again, there are further differences between these two examples:
the direct object in (44) is a proper name, while it is a plural indefinite in (45). According
to the Referentiality Scale a proper name obligatorily takes DOM, while a plural
indefinite can take it optionally.


(44) (iia): DOM and indirect object realized as clitic pronoun
Tengo el honor de presentar-os_IO [al señor conde del Verde Saúco]_DO, de quien
acabamos de recibir esa carta pidiéndonos nuestra hija en matrimonio. (Larra,
Mariano José de. (1809-1837), No más mostrador)
‘I have the honor of introducing to you the count of Verde Saúco [...].’


(45) (iib): no DOM and indirect object realized as clitic pronoun
Con estas dos causas, que una bastara ante vos, parezco, y [profetas y
doctores]_DO por testigos os_IO presento. (Calderón de la Barca, Pedro.
(1600-1681), El pleito matrimonial del cuerpo y el alma)
‘With these two cases I appear, so that one should suffice before you, and I
present to you prophets and doctors as witnesses.’


In the following two instances, the indirect object is not overtly expressed. In (46) the 
descriptively rich proper name is a-marked, while the indefinite plural noun phrases in 
(47) are not. This seems to replicate the effect of (44) vs. (45) in that the position on the 
Referentiality Scale determines DOM. 


(46) (iiia): DOM and indirect object not overtly realized
Tuvo el emperador_Sub aviso en Alemania de la muerte de nuestro obispo don
Antonio Ramírez; y presentó para obispo [a nuestro gran segoviano fray
Domingo de Soto]_DO, que interpolado el santo concilio, fue llamado del césar para
su confesor. (Colmenares, Diego de. (1586-1651), Historia de la insigne ciudad de
Segovia y compendio de las historias de Castilla)
‘While in Germany the emperor was informed about the death of our bishop don
Antonio Ramírez; and he proposed as bishop our great Brother Domingo de
Soto from Segovia who was called by the emperor as his confessor after the
interpolation of the holy council.’


(47) (iiib): no DOM and indirect object not overtly realized
Luego veintiocho hermanos conducidos de Juan de Dios; de la Victoria, ochenta, por
su ministro provincial regidos. Ochenta y seis_DO San Augustín_Sub presenta,
ciento_DO da San Francisco_Sub, y otros ciento_DO santo Domingo_Sub da con
igual cuenta. (Espinosa, Pedro. (1578-1650), Poesía)
‘Afterwards twenty eight brothers brought from Juan de Dios; from Victoria
eighty, controlled by the provincial minister. San Augustin presents eighty six,
San Francisco gives hundred, Santo Domingo gives hundred more with
identical bill.’


4.3 Main results 
4.3.1 Referentiality 


Referentiality of the direct object is one of the main factors in determining a-marking in 
transitive constructions. This also holds for ditransitive verbs. As can be seen in Table 13, 
nearly all direct objects realized as bare nouns are unmarked and all except one realized 
as proper names are marked. This means that the variation only affects definite and 
indefinite noun phrases. 


Table 13: Referentiality or types of direct objects (bare, indefinite, definite,
proper name) and DOM in the Corpus del Español (17th to 20th century).


Type of noun phrase   DOM          No DOM       Sum

bare noun             10% (1)      90% (9)      100% (10)
definite NPs          67% (116)    33% (56)     100% (172)
indefinite NPs        27% (40)     73% (110)    100% (150)
proper name           99% (66)     1% (1)       100% (67)
total                 56% (223)    44% (176)    100% (399)


Therefore, the remaining discussion is limited to definite and indefinite noun
phrases, 322 hits in total, with nearly as many a-marked direct objects as unmarked
ones, as listed in Table 14.7 There is the expected difference between these two groups


7Note that the 322 hits are the number that has already been presented in Table 12, where only definite and
indefinite direct objects were listed.
of referential expressions: only 27% of indefinite noun phrases are marked, compared
with 67% of definites.


Table 14: Distribution of definite and indefinite direct objects and DOM in the 
Corpus del Español (17th to 20th century). 


Type of noun phrase DOM No DOM Sum 

definite NPs 67% (116) 33% (56) 100% (172) 
indefinite NPs 27% (40) 73% (110) 100% (150) 
total 48% (156) 52% (166) 100% (322) 


4.3.2 Type of ditransitive construction 


Hypothesis 1 states that the type of ditransitive construction determines the blocking
effect. I distinguished between (i) constructions with indirect objects realized as
a-marked full noun phrases (definite NPs, indefinite NPs), which should show a high
blocking effect, (ii) constructions with indirect objects as clitic pronouns, which should
show a low blocking effect, and (iii) constructions with non-overt indirect objects, which
should not show any blocking effect.

The data show that (i) the construction with a full indirect object blocks a-marking of
the direct object: only 24% of the direct objects are a-marked in this construction.
On the other hand, if the indirect object is not realized, 54% of the direct objects are
a-marked. This very much corresponds to the percentage of DOM with transitive verbs,
see Table 3 above. (ii) The construction with an indirect object realized as a clitic pronoun
shows less blocking than the full noun phrase and more blocking than the case without an
overt indirect object.8


Table 15: Distribution of types of indirect objects in percentage of a-marking 
(absolute values) in the Corpus del Español 


realization of IO full human IO clitic pronoun IO no overt IO sum 


DOM 24% (8/34) 44% (27/64) 54% (121/224) 48% (156/322) 


4.3.3 Diachronic development 


Table 16 summarizes the diachronic development from the 17th/18th century to the 19th/ 
20th century - two centuries have been collapsed in order to have a larger number of 
instances. For a zero realization and the realization by a clitic pronoun of the indirect 


8The contrast between these three constructions is not an effect of an uneven distribution of definite vs.
indefinite direct objects (see Table 14). In all three construction types, the number of definite and indefinite
direct objects is more or less equal.
object, no blocking effect is observable. In both construction types a-marking
increases over time, as in the case of transitive verbs (see Melis 1995; Laca 2006;
von Heusinger & Kaiser 2007; von Heusinger 2008; see Table 3-Table 7 in §2 above).
What is surprising, though, is that for full indirect objects the a-marking of the direct
object is blocked in 70% and 100% of the cases, respectively. This would suggest that only
the overt a for the indirect object blocks the a-marking of the direct object. Note,
however, that there were only 7 instances of this construction in the 19th/20th century.


Table 16: DOM for human full direct objects and 17th/18th vs. 19th/20th century 
in percentage (absolute values) in the Corpus del Español 


cent full human IO pronominal clitic IO no overt IO 
17th/18th 30% (8/27) 22% (7/32) 45% (46/102) 
19th/20th 0% (0/7) 67% (20/30) 60% (75/124) 
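

To illustrate how little the cells with full human indirect objects constrain the underlying rate, one can attach a confidence interval to each proportion. The sketch below uses the Wilson score interval; this is my own illustrative choice, not a method applied in the study, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96: roughly 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot
    return max(0.0, center - half), min(1.0, center + half)


# Full human IO cells of Table 16: (a-marked DOs, all DOs)
cells = {"17th/18th": (8, 27), "19th/20th": (0, 7)}

for period, (k, n) in cells.items():
    low, high = wilson_interval(k, n)
    print(f"{period}: {k}/{n}, interval {low:.2f} to {high:.2f}")
```

Under these assumptions the interval for 8/27 runs from roughly 16% to 48% and the one for 0/7 from 0% to roughly 35%. The two intervals overlap considerably, which is in line with the caution expressed above about the 7 instances.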


4.3.4 Verb class 


The third hypothesis is that verb class differences are mirrored in the blocking of the
a-marking of the direct object (or in the strength with which a-marking of the direct
object is required). Earlier work has shown a clear difference between transitive verb
classes. According to the study discussed in §2.2 above, transitive verbs that require an
animate direct object (such as matar ‘to kill’) take DOM more often than verbs like
tomar ‘to take’ that prefer an inanimate direct object (see Table 4-Table 7 in §2.3
above). In a forced choice experiment conducted by von Heusinger (2017), verbs of caused
perception (presentar ‘to present’, proponer ‘to propose’) received DOM in 54% (98/182)
of the cases, while verbs of caused motion (enviar ‘to send’, mandar ‘to send’) received
DOM in 65% (119/182) of the cases. Therefore, I predict that in the diachronic corpus
a-marking will be more frequent with verbs of caused motion than with verbs of caused
perception, even in typical blocking contexts. However, as can be seen in Table 17,
a-marked direct objects are more frequent with verbs of caused perception (68%) than
with verbs of caused motion (49%) if the indirect object is not realized. There is also a
slight preference for a-marking with verbs of caused perception over verbs of caused
motion in the other conditions.


Table 17: DOM for human animate full direct objects and verb class in percent- 
age (absolute values) in the Corpus del Español 


verb class full human IO IO as clitic only no overt IO 
A: presentar, recomendar 26% (5/19) 45% (25/56) 68% (34/50) 
B: enviar, poner 20% (3/15) 33% (2/6) 49% (87/176) 
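

Pooling the two verb classes of Table 17 gives the overall rate of a-marking per realization of the indirect object. The following Python sketch, added for illustration with names of my own choosing, performs this aggregation on the Table 17 counts.

```python
# Table 17 cells: verb class -> IO realization -> (a-marked DOs, all DOs)
table17 = {
    "A": {"full": (5, 19), "clitic": (25, 56), "none": (34, 50)},
    "B": {"full": (3, 15), "clitic": (2, 6), "none": (87, 176)},
}


def pooled(io_type):
    """Sum a-marked and total direct objects over both verb classes."""
    marked = sum(cells[io_type][0] for cells in table17.values())
    total = sum(cells[io_type][1] for cells in table17.values())
    return marked, total


for io_type in ("full", "clitic", "none"):
    marked, total = pooled(io_type)
    print(f"{io_type}: {round(100 * marked / total)}% ({marked}/{total})")
```

The pooled percentages (24%, 44%, 54%) match those of Table 15. The pooled denominators for the clitic and no-overt-IO conditions come out as 62 and 226, as they do when the two rows of Table 16 are summed, whereas Table 15 lists 64 and 224; all three tables add up to 322 tokens.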
5 General discussion and conclusion


In §1 I put forward three hypotheses, which are repeated below and which were tested by
the extended corpus search and the analysis in the last section. Due to the scarce data I
cannot make any statistically significant claims, but the figures show certain tendencies
with respect to the hypotheses.


• H1: The type of the ditransitive construction determines the blocking effect:


    i.   constructions with indirect objects realized as a-marked full noun phrases
         (definite NPs, indefinite NPs) show a high blocking effect,


    ii.  constructions with indirect objects as pronominal clitics show a low blocking
         effect, and


    iii. constructions with non-overt indirect objects do not show any blocking
         effect.


• H2: DOM in ditransitive constructions shows a development comparable to that of DOM in
  transitive constructions.


• H3: Verb classes differ with respect to the way they influence DOM and
  DOM-blocking.


The analysis of the corpus data suggests that Hypothesis 1 is correct. Type (i) realizes
the indirect object as a full noun phrase that is obligatorily marked by a. Here a-marking
of the direct object is very low. In type (iii), the indirect object is not realized, either
because it is inferred from the context or because it is left unspecified. Here, a-marking
of the direct object is high and similar to pure transitive constructions. In type (ii), the
indirect object is realized as a clitic pronoun. Here the rate of a-marking lies between
those of constructions (i) and (iii). If correct, this is surprising since no overt a for the
direct object is available.

The diachronic development of DOM in ditransitive constructions follows the diachronic
development of DOM in transitive constructions. However, the blocking effect for
construction (i) is becoming stronger over the years. Due to the very low figures I cannot
estimate whether this is a stable tendency or not. There is no clear evidence for
Hypothesis 3, as the contrasts between the two verb classes are minor, except for the
transitive construal (iii), where a tendency towards more marking with verbs of caused
perception can be seen.

The investigation of a corpus of diachronic data of ditransitive constructions in Spanish
has revealed that DOM in ditransitive constructions has developed similarly to DOM
in transitive constructions, namely along the Referentiality Scale and the Affectedness
Scale. However, DOM in ditransitive constructions occurs with a lower frequency than in
transitive constructions. This effect is generally assumed to be the result of some blocking
between the a-marking of the indirect object and the a-marking (i.e. DOM) of the direct
object. I have investigated three types of ditransitive constructions: (i) with indirect
objects realized as a-marked full noun phrases (definite NPs, indefinite NPs), (ii) with
indirect objects as clitic pronouns, and (iii) with non-overt indirect objects. There is a clear
difference between these three types: DOM is more frequent with (iii) and less frequent
with (i). The data revealed an interesting interaction with the diachronic development:
for construal (i) I found more DOM in the 17th and 18th century than in the 19th and 20th.
The data did not support a strong interaction between verb class and DOM. Nevertheless,
they show the importance of an analysis that allows one to distinguish nominal from verbal
parameters.
Abbreviations

2      second person
3      third person
ACC    accusative
DAT    dative
DO     direct object
DOM    differential object marking
GEN    genitive
IMP    imperative
INF    infinitive
IO     indirect object
MASC   masculine
NOM    nominative
PL     plural
SG     singular
SUBJ   subject
In Samoyedic syntactic objects and, to a much lesser extent, syntactic subjects are
morphologically marked in some way if they pragmatically deviate from the prototypical
grammatical relation they represent. The present paper focuses on the Northern Samoyedic
branch in this respect, where morphological case and possessive marking, the selection of
conjugational patterns and even argument drop are employed to a variable extent in order to
assign grammatical functions and to distinguish between the involved arguments and their
semantic and pragmatic characteristics. It provides evidence for the fact that the synchronic
variation in the manifestation and application of these means in the Northern Samoyedic
languages Nganasan, Tundra Nenets and Forest Enets can be explained by the interrelation
between the individual developmental paths that specific nominal, pronominal and verbal
markers have followed. Whereas in Nganasan the morphophonemic change of number and
accusative case markers in conjunction with possessive morphemes and moreover the
grammaticalization of the latter to definiteness markers has resulted in a system of
differential object marking (DOM) that exclusively applies to nouns, in Tundra Nenets and
Forest Enets DOM is implemented by the verbal morphology. This variation in differential
marking is attributable to the fact that the agreement suffixes of the objective conjugation
in Tundra Nenets and in Forest Enets, but not in Nganasan, have adopted substantial
functional features of ambiguous object agreement suffixes and at the same time of topic
markers. An instance of differential subject marking (DSM) only exists in Nganasan. In
contrast to Tundra Nenets and Forest Enets, where the paradigm of personal pronouns has
been enriched by suppletive accusative forms, Nganasan relies on morphological realization
and non-realization in order to mark subject pronouns whose referents do not exhibit the
topic- and agent-worthiness of prototypical actor subjects but rather combine specific
semantic and pragmatic features of undergoer objects.


1 Introduction 


Samoyedic, the eastern principal branch of the Uralic family, nowadays consists of four
still living language groups: Nganasan with its dialects Vadey and Avam (Helimski 1998a:
480-482), the Nenets sub-branch, which is split up into Tundra Nenets and Forest Nenets
(Salminen 1997: 13-14; Nikolaeva 2014: 1-2), Enets with its sub-languages Tundra Enets 
and Forest Enets (Siegl 2013: 45) and finally Selkup, which forms a broad dialect contin- 
uum (Helimski 1998b: 549-550). According to the classical taxonomy, which is illustrated 
in Figure 1, the former three language groups constitute the Northern Samoyedic branch, 
the language area of which is located in North West Siberia and extends from the White 
Sea region in the West to the Khatanga gulf in the East. Selkup is the last survivor of 
the Southern Samoyedic group, which also encompassed the by now extinct Sayan or 
Mountain Samoyedic languages Kamas and Mator (Janhunen 1998: 457-458). Selkup is 
still sparsely spoken in the West Siberian taiga region enclosed by the Ob and the Yeni- 
sei River in the west and the east and by the Turukhan and the Chulym River in the 
north and the south. More recent approaches interlink Nganasan and Mator due to their 
affiliation to the supposedly more archaic, eastern part of Samoyedic by separating the 
former from Nenets and Enets and the latter from Kamas and Selkup (cf. Janhunen 1998: 
458-459; Siegl 2013: 35-36). 


[Figure 1: tree diagram of the Uralic languages with the two principal branches Finno-Ugric
and Samoyedic. Samoyedic divides into Northern (Nganasan, Nenets, Enets) and Southern
(Selkup, Kamass (†), Mator (†)). Legend: bold = accusative -m still exists; italic = plural
-j still exists; a third marking = differentiation between subjective and objective
inflection on verbs.]


Figure 1: Taxonomy of the Uralic languages with localization of structural
case/definiteness markers and conjugational splits
Samoyedic generally employs differential argument marking (DAM). More precisely,
syntactic objects and, to a lesser extent, syntactic subjects are morphologically marked 
in some way if they pragmatically or semantically deviate from the prototypical gram- 
matical relation they represent. Like certain languages of the Finno-Ugric branches Ob- 
Ugric and Volgaic, Samoyedic has partially preserved the original Proto-Uralic object 
marker *-m (cf. Figure 1). The plural suffix *-j, which is still present in the Baltic-Fennic 
languages Estonian and Finnish and in Hungarian (cf. Figure 1), has a differentiating 
function, especially in Nganasan. Like the entire Ugric branch and the Finno-Volgaic 
language Mordva, Samoyedic exhibits an essential conjugational split between the subjective
or “indeterminative” inflection and the objective or “determinative” inflection.¹
Especially in the Northern Samoyedic languages finite verbs that inflect in the objective 
conjugation agree not only with the syntactic subject in person and number but also 
with the direct object in number (Abondolo 1998: 27-30). Since the Samoyedic number 
category is subdivided into the values singular, plural and moreover dual, there are three 
agreement paradigms within the objective conjugation of Nganasan, Nenets and Enets. 

Northern Samoyedic makes use of morphological case marking, the selection of conjugation
types and even argument drop to a variable extent in order to distinguish between
arguments and their semantic and pragmatic properties and in order to establish
grammatical relations. This paper draws on modern Nganasan, Tundra Nenets and Forest Enets
data that have been made available by the universities of Moscow and Vienna in the context
of their research projects “LangueDOC” and “Negation in Ob-Ugric and Samoyedic
Languages (NOS)” (cf. “Data sources”)² on the one hand and by Siegl (2013) and Nikolaeva
(2014) in the data sections of their Forest Enets and Tundra Nenets grammar books³ on
the other hand. On this basis it will be shown that these languages represent different
intermediate stages in the rise and loss of structural case marking and the development of
objective suffixes on verbs.

While §2 presents a cursory overview of argument marking and DAM in Early Uralic,
§3 is dedicated to the mechanisms of DAM in Nganasan. It will turn out that Nganasan
employs differential object case markers on nouns but does not yet feature any distinct
structural case marking on personal pronouns. It is argued in §3.2 that the case syncretism
of the latter is resolved by specific restrictions on their morphological realization
or non-realization, respectively. As shown in §3.3, the agreement suffixes of the objective
conjugation have not yet adopted any characteristics of grammatical object agreement
markers in Nganasan. They incorporate anaphoric third person object arguments
by themselves and co-occur with lexical objects only if they are bound as resumptive
pronouns in a typical left-dislocation construction. §4 and §5 illustrate that in Tundra


¹In classical Uralistics the subjective conjugation is often called “indefinite” conjugation whereas the
objective conjugation is referred to as “definite” conjugation.

²The corresponding online corpora consist of various annotated narrative texts and comprise 905 Nganasan,
260 Tundra Nenets and 229 Forest Enets sentences in total.

³Siegl's (2013) grammar of Forest Enets contains various narrative texts that consist of 254 Forest Enets
sentences in total. Nikolaeva's (2014) grammar of Tundra Nenets contains the edited versions of two
Nenets narrations (comprising 482 sentences) that were recorded by Labanauskas in the early 1990s (cf.
Labanauskas 1995).
Nenets and Forest Enets differential object case marking (DOC) on nouns does not exist.
Whereas Tundra Nenets exhibits uniform accusative case marking in its nominal
declension, Forest Enets has lost structural case markers on nouns almost entirely.
As elucidated in §4.2 and §5.2, however, the paradigm of personal pronouns in both
languages has by now been enriched with distinct accusative forms. Their third person
forms are mostly dropped in favor of an objective suffix on the corresponding verbal head.
In contrast to the agreement morphology of the Nganasan objective verb forms, the
agreement morphemes of the Tundra Nenets and Forest Enets objective inflection have gained
essential properties of ambiguous object agreement markers. They are no longer simply
hosts of the selected object argument. That is why they co-occur with clause-mate objects
to a variable extent. In Tundra Nenets, as illustrated in §4.3, they predominantly
specify relevant pragmatic properties of these objects, while in Forest Enets, as shown in
§5.3, they have a discriminatory function.


2 Differential argument marking in Early Uralic 


The main strategies of Northern Samoyedic DAM have their roots in Proto-Uralic. This 
pertains to differential case marking as well as to the conjugational split. Both emerged 
or were already present in some way in the earliest Uralic language periods. 


2.1 The nominal suffixes *-m and *-j in Early Samoyedic 


According to Künnap (2008b: 34-35) Proto-Uralic subject and object nouns were dis- 
tinctively marked with respect to the categories of number and definiteness but lacked 
any case distinctions. Künnap (2008b) identifies the singular definiteness marker *-m for
Proto-Uralic. Katz (1979: 172-175), Janhunen (1982: 29-31) and Honti (1995: 65-67) postu- 
late the existence of the plural morphemes *-t and *-i in Proto-Uralic. Following Mikola 
(1988: 238-239) *-i corresponds to the glided semi-vowel *-j, which as inflectional mark- 
ing derived from an early general augmentative suffix and later functionally contrasted 
with the other plural marker *-t. Katz (1979) argues that *-t performed the function of
definiteness marking in the Proto-Samoyedic plural paradigm. The suffix *-j, however, not
only encoded plurality and the absence of definiteness but also indicated accusative case 
in his opinion. Abondolo (1998: 21) agrees with Katz (1979) regarding the number and case 
marking function of *-j. Like Salminen (1996: 27) and Janhunen (1998: 469; 2009: 63), he 
also defines the Proto-Uralic *-m as a full-fledged object case marker. But he additionally 
points out that *-m originally only attached onto definite nouns. Thus, Abondolo (1998) 
only partially disagrees with Künnap (2008b: 35) who takes the view that marking by *-m
was generally applied in order to morphologically indicate definiteness in unexpected
cases. While definiteness, which is connected to the topic-worthiness and animacy of the 
referent, is a prototypical feature of agents, it is highly atypical for patient arguments (cf. 
Kuno 1987: 212-214; Payne 1997: 149-158; Aissen 2003). Since Uralic employs accusative 
alignment with respect to its case and agreement marking, Künnap (2008b) infers that 
singular objects but not singular subjects of Proto-Uralic were provided with *-m when
definite.

Hence, there are different approaches to Early Uralic object and definiteness marking,
as well as to Early Uralic DAM. At least Katz (1979), Abondolo (1998) and Künnap
(2008b) belong to those Uralists who assume that Early Uralic in some sense exhibited
DOC conditioned by the definiteness and indefiniteness of the lexical nouns involved.
Neither a definitive rejection of nor definitive support for Katz's (1979), Abondolo's (1998)
and Künnap's (2008b) accounts has yet been brought forward. The question of whether
Samoyedic has inherited the Early Uralic nominal markers unaltered is also still
a matter of debate (cf. Mikola 1988: 237; Salminen 1996: 66; Künnap 2008b: 36). Since
the above mentioned subject and object markers or traces of them are visible in re- 
cent Samoyedic, it seems plausible to reconstruct them into Proto-Samoyedic. Under the 
premise that they were assigned a differentiating function, Early Samoyedic employed 
differential object marking (DOM) by differential case marking. More precisely, Early 
Samoyedic definite singular objects differed from their indefinite counterparts and from 
singular subjects in that they assumed the Uralic *-m-suffix. Definite plural objects dif- 
fered from indefinite plural objects and also from indefinite plural subjects in that they 
exhibited the plural *-t-marker. Indefinite plural objects differed from their definite coun- 
terparts and, moreover, from indefinite plural subjects in that they exhibited the *-i or 
*-j-suffix. Exactly this is schematically summed up in the following table: 


Table 1: Case/definiteness markers on nouns in Early Samoyedic


              singular                  plural
              definite    indefinite    definite    indefinite
nominative    -           -             *-t
accusative    *-m         -             *-t         *-j


2.2 The conjugational split in Early Samoyedic 


According to Gulya (1995); Honti (1995; 2009); Abondolo (1998); Havas (2004); Körtvély 
(2005); Künnap (2008a) and É. Kiss (2010), to mention just a few, the conjugational split 
between the subjective and the objective conjugation is also nascent in some of the ear- 
liest Uralic language periods. Honti (1995: 59, 2009: 136-143), Havas (2004: 119-138) and 
Körtvély (2005: 70-88) among others assume that the objective pattern descends from 
definite third person pronouns that encliticized onto finite verbs of transitive clauses. 
They argue that the Uralic third person singular verb forms were the first finite verbs 
that exhibited the conjugational split. In Havas's (2004) and Körtvély's (2005) opinion,
this is because only the third person singular verb form of the Early Uralic general con- 
jugation lacked an agreement suffix and therefore allowed for an analysis of the object 
enclitic as an inflectional ending of a special conjugation type. Havas (2004) takes the 
view that the first and second person objective verb forms emerged much later, after the
division into the separate Uralic branches. In his opinion the Hungarian first and second 
person objective verb forms displaying a (V)m- or (V)d-suffix used to belong to the com- 
mon Uralic verbal subject agreement paradigm. He argues that they were re-interpreted 
as first and second person finite verbs that include a definite third person pronominal 
object, while finite third person singular verb forms that were followed by a third
person object clitic prevailed as regular agreeing verb forms. Mikola's (1988) and Körtvély's
(2005) investigations suggest a similar development for Samoyedic. They point out that
the recent Samoyedic first and second person singular subjective verb forms came into
being later than the corresponding first and second person singular objective verb forms.
Hence, following Havas (2004) and Körtvély (2005), the Uralic third person singular
subjective form is the only subjective form that is of earlier origin than its objective
counterpart. This, however, is not in line with Honti's (1995; 2009) considerations. Honti (1995;
2009) argues for a scenario where the Uralic first and second person objective verb forms 
were analogously created on the basis of verb forms that later made up the subjective 
conjugation or, at least, where these forms arose in tandem with specialized subjective 
forms. 

Künnap (2008a: 191-196) agrees with the approaches by Honti (1995; 2009), Havas 
(2004), and Körtvély (2005) with respect to the role of the third person singular verb form.
In other words, he also assumes that the development of the Uralic objective conjugation 
started with third person singular verb forms that indicated the presence of third person 
objects. But, similarly to Rédei (1962), he formulates the hypothesis that demonstrative 
suffixes are the source of the verbal objective suffixes. Since, in his view, especially third 
person possessor agreement affixes generally represent such demonstrative meanings, 
they attached to the corresponding third person verb forms in the beginning. With that 
Künnap (2008a) is able to explain the match between the Uralic third person possessor
agreement markers on nominal and pronominal categories and the corresponding third 
person agreement markers on objective verb forms. 

Others, for example Gulya (1995) and É. Kiss (2010), assume that there were various 
conjugation types already in the early language periods of Uralic. Whereas Gulya (1995: 
99) argues for the existence of an intransitive-transitive split in Proto-Uralic, É. Kiss 
(2010: 140-145) traces at least the Hungarian conjugational split back to three separate 
verbal paradigms. In her opinion, these paradigms were a reflex of topic agreement. In 
the presence of a subject topic the clausal main verb agreed with the subject, in the 
presence of an additional object topic it agreed with the subject and the object and in 
the absence of any topic it lacked agreement markers. These three agreement patterns 
melted into two in Hungarian. Especially the objective pattern was composed partly of 
forms agreeing with the subject and partly of forms simultaneously agreeing with the 
subject and the direct object. According to É. Kiss (2010), it used to indicate the topichood 
of the clausal object. 

Hence, whether the conjugational split had a differential argument marking function 
already before the separation of the various Uralic branches is still a matter of debate. 
Honti (1995; 2009), Havas (2004) and Körtvély (2005), among others, contest this.
They hold the view that the conjugational split had nothing to do with DAM in Early
Uralic. They argue that the objective marker, which exclusively appeared on certain 
third person verb forms in the beginning, represented a third person pronominal argu- 
ment by itself. Künnap (2008a) and É. Kiss (2010), however, relate the earliest objective 
suffixes or their immediate predecessors, respectively, to the information structure of 
the corresponding clauses. More precisely, in their view these suffixes indicated a non- 
prototypical pragmatic status of objects and were therefore responsible for DOM in some 
sense. 


3 Nganasan: Differential argument marking on nouns and 
pronouns 


Together with Mator, which has probably been extinct since the early 19th century, Nganasan
forms the eastern tract of the Samoyedic language area. As depicted in Figure 1 above,
Nganasan has preserved the Uralic accusative marker -m as well as the plural morpheme
-j. These markers are dealt with in §3.1. It is shown that they can be defined as differential
object case markers in some sense. In §3.2 it is elucidated that the Nganasan paradigm of
personal pronouns has not yet developed any structural case markers. Argument drop on
the one hand and morphological realization on the other hand specify the corresponding
syntactic functions. The agreement suffixes of the Nganasan objective conjugation are,
as shown in §3.3, still at the outset of their grammaticalization to differential object
markers.


3.1 Differential object marking on nouns 


The Uralic case and number markers -m and -j are involved in DOM in Nganasan. The 
morpheme -m nowadays suffixes to Nganasan singular accusative nouns only if
they are definite (cf. (1)).⁴ The definiteness of these objects is always additionally marked
by a possessor agreement marker. Even if there is no potential possessor that has been 
introduced in the preceding context or discourse, the accusative marker -m precedes 
such a morpheme. 


(1) Nganasan (Avam) (Northern Samoyedic; NOS. mou djamezi.134, 313)

    a. Toti-ra          merigiai-?     t'entiri-7i-0o    n'enat'a-?a
       that-2SG(POSS)   quick-GEN.PL   make-PF-3SG.RC    huge-AUGM
       bakaa-?a-m-ti                n'akol'i-?e
       scraper-AUGM-ACC-3SG(POSS)   take-PF(3SG.SC)
       ‘He prepared everything and took the big scraper ...’


⁴The spelling of the example sentences cited in this article largely complies with the spelling of the
corresponding data in the corpora (but see footnote 9). Consequently, the spelling of data originating from
different corpora may vary slightly even if they document one and the same language. 
    b. Taharida   noj-mo               tabo-l'i-?e-r.
       now        leg-ACC.1SG(POSS)    press-INCH-PF-2SG.SC
       ‘You are squeezing my leg now.’


Especially the third person possessive suffixes, such as -ti in (1a), have meanwhile 
entered the grammaticalization path to nominal definiteness markers on objects.⁵ They
have lost their specific reference to any possessing entity via semantic bleaching. As 
shown by Gerland (2014), nowadays they indicate general belonging and thus a certain 
degree of specificity. That is why they are used for expressing prominence or simply 
definiteness in contexts that lack any available possessor. 

Since accusative -m has degeminated in conjunction with the first person singular, 
dual and plural possessive affixes -ma, -mi^ and -mu?, the accusative possessum nouns 
agreeing with any first person possessor are homonymous with the corresponding nom- 
inative forms (Salminen 1996). Hence the object nojma ‘my leg’ of (1b), which was pre- 
sumably pronounced with a gemination of the bilabial nasal -m (*ojmmo) in earlier 
language periods, formally coincides with the corresponding nominative noun. 

The absence of the accusative -m suffix on indefinite singular objects like sanohúaa 
‘a larch’ and kuba?a ‘a huge skin’ in (2) is not a reflex of the Early Uralic DOM. Rather,
it has to do with a quite innovative phonological change that has resulted in a regres- 
sive assimilation ensuing from the word final accusative -m and its subsequent apocope. 
Morphophonemic influences of an erstwhile -m morpheme, which was the obligatory 
accusative marker probably till the 19th century (cf. Castrén 1845: 156), can be observed 
on indefinite accusative non-possessum nouns until today (Wagner-Nagy 2002: 71-89; 
Katzschmann 2008: 357-365). 


(2) Nganasan (Avam) (Northern Samoyedic; NOS. kehy luu.114, NOS. mou
    djamezi.110)

    a. Sonahúaa     nodi-?o.
       larch(ACC)   find-PF(3SG.SC)
       ‘He found a larch.’

    b. ... binti?s'i    nenat'a-7a       kuba-7a         tada-fa
           wolverine    huge(ACC)-AUGM   skin.ACC-AUGM   bring-PF(3SG.SC)
       ‘... he brought a huge skin of the wolverine.’


Plural definite object nouns like s’iart’i ‘the news’ in (3a) match the corresponding 
possessum nominative nouns. Like the latter they undergo a stem alternation and display 


⁵In accordance with Hopper & Traugott (1993: 2) I define all diachronic processes where a specific lexeme or
discourse structure receives a grammatical function or where a function word or a functional morpheme 
becomes more functionalized through time as instances of grammaticalization. For the sake of simplicity I 
do not draw a distinction between “primary” and “secondary” (cf. Traugott 2004) grammaticalization. 

⁶Toivonen (1998), Bartos (1999) and Dékány (2015) among others have observed a similar distribution of third
person possessor agreement affixes in some Ugric and Saamic-Fennic varieties. According to them, these 
affixes have lost their person specification. They are suitable for speech act participant (SAP) as well as for 
non-SAP possessors. They merely indicate that the referent of the nominal expression they are attached to 
is in some possessive relationship. 
a possessor agreement affix, which is phonologically shaped by the formerly preceding
connective morpheme *-j (Wagner-Nagy 2002: 84). According to Janhunen (1982: 29- 
32) exactly this Uralic connective *-j has become the plural accusative marker in Early 
Samoyedic. In recent Nganasan it suffixes to all indefinite plural objects. This is shown 
in (3b) where the indefinite object lataaj ‘bones’ exhibits a final -j morpheme. With that 
the indefinite objects morphologically differ not only from their definite counterparts 
but also from the non-possessum plural subjects, which exhibit the plural marker -? 
like miroima? (‘the steps’) in (3c). As shown by Mikola (1988: 238), -? is an immediate
descendant of the Proto-Uralic plural marker *-t. 


(3) Nganasan (Avam) (Northern Samoyedic; NOS. mou djamezi. 173, 062, 130)

    a. Bəńďə      toenifia   s'ior-ti                   d'ebta-7a.
       all(ACC)   so         affair(ACC)-PL.3SG(POSS)   tell-PF(3SG.SC)
       ‘He told all the news.’

    b. Taharida   satora-nku      maa-güo   hün's'aroad'oa   latoo-j
       now        polar.fox-DIM   what-CL   ancient(ACC)     bone-ACC.PL
       nonai-?           toda-1a.
       one.more-GEN.PL   bring-PF(3SG.SC)
       ‘Then the little polar fox brings some old bones.’

    c. ... miroima-?     sojbu-?o-?                 n'enama-gito.
           step-NOM.PL   begin.to.sound-PF-3PL.SC   neighbour-ABL.PL
       ‘The steps of the neighbour resounded.’


Dual objects are exempted from DOM. On the one hand, this is because there is no spe- 
cific agglutinative accusative morpheme in the dual number. On the other hand, duality 
is in some sense associated with the cohesiveness of the involved participants anyhow. 
As a consequence, dual objects normally display a possessor agreement affix in Ngana- 
san like in all other Samoyedic languages - irrespective of how definite they are. Thus, 
they are naturally syncretic with the corresponding nominative dual possessum nouns. 

Consequently, there is DOM only on singular and plural nouns in contemporary Nga- 
nasan. The accusative marker -m suffixes to singular definite objects and is always accom- 
panied by a possessor agreement affix. In this way Nganasan definite singular objects 
differ from their indefinite counterparts, whose accusative marker has demorphologized 
and which moreover lack any possessor agreement suffix. The accusative marker -j, how- 
ever, suffixes to indefinite plural objects. Accordingly, Nganasan indefinite plural objects 
differ from their definite counterparts, whose former number marker and predecessor of 
the accusative -j has demorphologized and which moreover take a possessor agreement 
affix. This is summed up in Table 2.⁷


⁷SA = stem alternation; POSS = possessor agreement morpheme




Melani Wratil 


Table 2: Structural case/definiteness markers on nouns in Nganasan


              singular                    plural
              definite      indefinite    definite     indefinite
nominative    -             -             -?           -?
accusative    (SA)-m-POSS   (SA)          (SA)-POSS    (SA)-j
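
The accusative row of Table 2 can be read as a lookup from number and definiteness to a schematic noun shape. The sketch below only restates the table; the names ACCUSATIVE_SHAPES, accusative_shape and shows_dom are hypothetical, and dual nouns are deliberately left out because the text treats them as exempt from DOM.

```python
# Schematic accusative shapes copied from Table 2, keyed by (number, definite).
# SA = stem alternation, POSS = possessor agreement morpheme.
ACCUSATIVE_SHAPES = {
    ("singular", True): "(SA)-m-POSS",
    ("singular", False): "(SA)",
    ("plural", True): "(SA)-POSS",
    ("plural", False): "(SA)-j",
}


def accusative_shape(number: str, definite: bool) -> str:
    """Return the schematic shape of an accusative noun as listed in Table 2."""
    key = (number, definite)
    if key not in ACCUSATIVE_SHAPES:
        # Dual objects are not covered: they are described as exempt from DOM.
        raise ValueError(f"Table 2 has no accusative entry for number={number!r}")
    return ACCUSATIVE_SHAPES[key]


def shows_dom(number: str) -> bool:
    """True if definite and indefinite objects differ in shape for this number."""
    return accusative_shape(number, True) != accusative_shape(number, False)


if __name__ == "__main__":
    for (number, definite), shape in ACCUSATIVE_SHAPES.items():
        label = "definite" if definite else "indefinite"
        print(f"{number:9} {label:11} {shape}")
```

In this reading both the singular and the plural show a definiteness contrast on objects, although the overt accusative marker sits on the definite form in the singular (-m) and on the indefinite form in the plural (-j).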


3.2 Differential argument marking on personal pronouns 


Table 3 illustrates that Nganasan personal pronouns do not show any morphological 
distinction between their structural case forms (cf. Wagner-Nagy 2002: 93). 


Table 3: Structural case paradigm of Nganasan personal pronouns (Wagner-Nagy 2002)


          nominative   accusative   genitive
1SG       mana         mana         mana
2SG       tənə         tənə         tənə
3SG       siti         siti         siti
1DUAL     mi           mi           mi
2DUAL     ti           ti           ti
3DUAL     siti         siti         siti
1PL       min          min          min
2PL       tin          tin          tin
3PL       sitin        sitin        sitin


Thus, Nganasan personal pronouns are at first glance inconsistent with the common 
markedness hierarchies of DOM, which predict that pronouns are generally more likely 
to be case marked than lexical nominal expressions (Bossong 1985; Croft 1988; Aissen 
2003). However, it has been shown in Wratil (2013) that, although the Nganasan system 
of personal pronouns does not employ any overt case marking of direct objects, it does 
not constitute a categorical exception to these hierarchies. This is because the individual 
grammatical function of its pronominal items is determined on the basis of their morpho- 
logical realization and non-realization. Whether and in which way personal pronouns 
appear is constrained by the ranking of their thematic roles in the actor and undergoer 
hierarchies as well as by their person feature value. Following Van Valin (2001: 53-72) 
the actor and the undergoer hierarchy can be outlined as follows: 


(4) Actor Hierarchy 
Agent > Instrument > Experiencer > Recipient 
(5) Undergoer Hierarchy
Patient > Theme > Stimulus > Experiencer > Recipient / Goal / Source / Location 


According to the actor hierarchy, the agent role has the most actor-like properties. It 
is the prototypical thematic role of all arguments that refer to acting, initiating, willing 
and mostly human entities. According to the undergoer hierarchy the patient role has the 
most undergoer-like properties. It is the prototypical thematic role of all arguments that 
refer to undergoing, passive and often non-human entities that are affected by an event 
or action. Experiencer and recipient roles combine actor and undergoer properties. They 
are low in the actor hierarchy as well as in the undergoer hierarchy. The correspond- 
ing referents are affected by conditions, situations, impressions or actions but are not 
completely passive and powerless. In most cases they are animate and willful entities. 

In Nganasan the realization of subject pronouns is constrained by the thematic role 
they bear (Wratil 2013: 248-262). The more actor-like the thematic role of a subject per- 
sonal pronoun is, the more likely it is unmarked, hence, the less likely it is to be realized 
as a free pronoun. On the other hand, the more undergoer-like its thematic role is, the 
more likely it is to have a morphological representation as one of the pronominal items 
illustrated in the first column of Table 2. This is illustrated in examples (6) and (7). 


(6) Nganasan (Avam) (Northern Samoyedic; NOS. mou djamezi.022, NOS. kehy luu.021)

a. (*Sitin) tahariaa maara-j kota-ka-ndu-?.
(they) now any-ACC.PL destroy-ITER-AOR-3PL.SC
‘They kill everything.’

b. Maa-da (*tono) mono muarkuj-nu-au-n?
what-ABL.ADV (*you) I torment-INTERR-EXCL-2SG.SC
‘Why are you tormenting me?’


(7) Nganasan (Avam) (Northern Samoyedic; NOS. kehy luu.036, Languedoc. dva
¿uma.023, Languedoc. $kola.024)

a. N'enat'a-?a hüda-?a kat'ami-?o.
huge(ACC)-AUGM tree(ACC)-AUGM see-AOR.3SG.SC
‘He noticed a tall tree.’

b. Mana taasada ta? namnam-sua-m.
I totally you.know be.hungry-PST-1SG.SC
‘I was totally hungry.’

c. *(Mi) tando Siadir-mani nimi-la-ri-?i-ni?
we.DU that.GEN window-PROL drag-INCH-PASS-AOR-1DU.RC
‘We were dragged through the window.’


In (6a) and (6b) the finite lexical verb selects a subject that features most
characteristics of a prototypical agent. Its referent is acting, initiating, willing and
animate. Consequently, it is not morphologically realized as a personal pronoun. Its
person and number features are specified by the inflectional morphology of the
corresponding verb. In (6a) the subjective subject agreement suffix of the main verb
indicates that the clausal subject is a third person plural subject. In example (6b) it
identifies a second person singular subject. By contrast, (7a) and (7b) contain a main
verb that assigns its subject an experiencer role. Since the experiencer role is quite low
on the actor as well as on the undergoer hierarchy, the corresponding pronominal subject
may be omitted, as in (7a), or morphologically realized, as in (7b). As shown by Wratil
(2013: 257-261), verbs that do not assign any specific thematic role, like copulas, or
that withdraw role assignment in some sense, like negation auxiliaries, are also quite
liberal with respect to the (non-)realization of their pronominal subjects. The same holds
true for verbs that background their agent argument due to a specific valence or aspect
marker. In passive clauses like (7c), however, the subject combines all properties of a
typical patient. It is therefore necessarily realized as an overt personal pronoun.
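
The realization pattern just described can be read as a small decision rule. The following Python sketch is only an illustration under simplifying assumptions: the function name `subject_pronoun_realization`, the three outcome labels and the grouping of roles into classes are introduced here for exposition and are not part of the cited analysis. In particular, treating recipient subjects like experiencers and theme subjects like patients is an assumption, and person features and verb-class effects (copulas, negation auxiliaries) are ignored.

```python
from enum import Enum


class Realization(Enum):
    OMITTED = "omitted"        # no free subject pronoun, cf. (6a), (6b)
    OPTIONAL = "optional"      # pronoun may be dropped or overt, cf. (7a), (7b)
    OBLIGATORY = "obligatory"  # pronoun must be overt, cf. (7c)


# Illustrative role classes (assumed grouping, see lead-in).
ACTOR_LIKE = {"agent"}
INTERMEDIATE = {"experiencer", "recipient"}
UNDERGOER_LIKE = {"patient", "theme"}


def subject_pronoun_realization(thematic_role: str) -> Realization:
    """Return the expected realization of a Nganasan subject pronoun.

    The more actor-like the role, the less likely the pronoun is overt;
    the more undergoer-like the role, the more likely it is overt.
    """
    role = thematic_role.lower()
    if role in ACTOR_LIKE:
        return Realization.OMITTED
    if role in INTERMEDIATE:
        return Realization.OPTIONAL
    if role in UNDERGOER_LIKE:
        return Realization.OBLIGATORY
    raise ValueError(f"unknown thematic role: {thematic_role!r}")
```

For instance, `subject_pronoun_realization("patient")` corresponds to the passive subject in (7c), whose pronoun cannot be dropped.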

Direct object personal pronouns, which are normally assigned the undergoer-like 
roles patient and theme, are always overt. Thus, their grammatical relation already deter- 
mines their morphological manifestation as overt free personal pronouns. As illustrated 
by (8) and (6b) above, this holds true at least for the speech act participant (SAP) objects, 
i.e. for all singular, dual and plural object personal pronouns with a first or second person 
specification. In (6b), for example, the transitive main verb takes a first person singular 
object and in (8a) a second person plural object, which is morphologically realized as tin.
The finite verb of (8b) follows its first person dual object mi. 


(8) Nganasan (Avam) (Northern Samoyedic; NOS. mou djamezi.223, Languedoc. $kola.034)

a. taharida timinía tin nada-rki-?a-m
now now you.PL(ACC) examine-RES-AOR-1SG.SC
‘Now I will search you.’

b. Bejki?miàfku t'üü-tü kunsi-mani mi
Beikimyaku sleeping.bag-GEN.3SG(POSS) inside-PROL we.DU(ACC)
mütomi-?2
put-AOR(3SG.SC)
‘Bejkimjaku puts us in her sleeping bag.’


Accordingly, the quite unusual lack of structural case marking within the Nganasan
paradigm of personal pronouns is compensated for by a system of realization and omission.
Whereas SAP objects are always realized by overt free personal pronouns, subject personal
pronouns are morphologically realized only if their thematic role deviates from the
thematic role that prototypical subjects are assigned. Consequently, Nganasan employs a
strategy of DSM that is mainly conditioned by semantic roles. Thus, it is an atypical
instance of DSM. But in some sense it is also a reflex of the topic-worthiness of
referents. More precisely, only Nganasan subjects that bear properties of high
topic-worthiness such as definiteness and/or animacy and moreover adopt a thematic role
that is extremely high on the actor hierarchy are completely unmarked and hence lack any
morphological representation.


3.3 Argument incorporation and objective conjugation 


The number of the third person personal pronouns siti and sitin that occur as direct
objects in the finite clauses of the accessible corpora is vanishingly small. Nevertheless,
there are numerous two- or more-participant clauses whose finite verb takes a third person
direct object that is definite and anaphoric. However, these clauses, for example (9a) and
(9b), differ from the other two- or more-participant finite clauses not only in that they
lack any free object but also in that their main verb is inflected in the objective
conjugation. The respective agreement suffixes are given in Table 4 below.⁸


(9) Nganasan (Avam) (Northern Samoyedic; NOS. mou djamezi.153, 241)

a. Ka’tami-?e-du.
look-PF-3SG.OC
‘He has looked at it.’

b. kuni-de yoto-d iad ‘aa-dun?
where-ABL find-PSTPF-3PL.OC
‘Where did they find it?’


Table 4: Verbal suffixes of the subjective, objective and reflexive conjugation
in Nganasan (Wagner-Nagy 2002)

         subjective  objective                            reflexive
                     singular  dual         plural

1SG      -m          -ma       -kai-j-na    -j-na     -na
2SG      Ro          -ra       -kai-j-ta    -j-ta     -n
3SG      Ø           -tu       -kəi-j-tu    -j-tu     to
1DUAL    -mi*        -mi*      -kai-jnif    -j-ni*    -nif
2DUAL    -rif        -rif      -koi-j-ti*   -j-ti*    -nti
3DUAL    -kaj        -tif      -kai-j-ti*   -j-ti*    -nti
1PL      -mu?        -mu?      -kai-jonu?   -j-nu?    -nu?
2PL      -ru?        -ru?      -kai-j-tu?   -j-tu?    -ntu?
3PL      -?          -tun      -kai-j-tun   -j-tun    nto?


As soon as any free pronominal direct object appears within a minimal clause, the
corresponding main verb inflects in the subjective conjugation, the inflectional pattern
of which is listed in the first column of Table 4. This holds true for all definite object
pronouns, for example the personal pronouns, including all SAP and third person pronouns,
and for all indefinite object pronouns. The sentences of (8) in §3.2 illustrate the
co-occurrence of SAP objects and finite verbs with subjective patterns. Example (10a)
belongs to the extremely rare clauses that contain a third person object personal pronoun,
while (10b) and (10c) exhibit indefinite pronominal objects. As can be observed, each of
these third person objects precedes a subjective verb form.


⁸ Table 4 only contains the basic morphs of these suffixes. Note that there is a wide range
of phonologically conditioned allomorphy within the Nganasan agreement paradigms.


(10) Nganasan (Avam) (Northern Samoyedic; NOS. kehy luu.196, NOS. mou
djamezi.027, 022)

a. Band’a-? siti n'üosij-t'i-? tanda kobtúa-m-tun
all-PL she kiss-PRS-3PL.SC there girl-ACC.SG-3PL(POSS)
n'üasi-ndi-?.
kiss-PRS-3PL.SC
‘All people kissed her, they kissed their girl there.’

b. maa nakala-ta-ni
what(ACC) take-FUT-INTER(3SG.SC)
‘What does it take?’

c. taharida maara-j kota-ko-ntu-?
now any-ACC.PL bag-ITER-PRS-3PL.SC
‘They kill everything.’


The vast majority of clauses that display a non-pronominal direct object are also
headed by a finite verb inflected in the subjective conjugation. None of the minimal
clauses containing a non-pronominal object constituent mentioned in §3.1 exhibits a
verbal head that bears an objective suffix, irrespective of whether this object
constituent is definite or indefinite. The clauses (11a), with a definite object, and
(11b), with an indefinite object, further illustrate that the subjective inflection is
triggered by the presence of any free object.
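
The basic distribution described so far can be sketched as a simple rule over the shape of the direct object. The Python snippet below is a deliberately minimal illustration: the function name `nganasan_conjugation` and the string labels are assumptions made for this sketch, it covers only transitive clauses with a third person or SAP object, and it sets aside rare configurations in which an objective verb form co-occurs with a free accusative noun.

```python
from typing import Optional

# Free object types distinguished in this sketch (hypothetical labels).
FREE_OBJECT_TYPES = {"pronoun", "noun"}


def nganasan_conjugation(free_object: Optional[str]) -> str:
    """Return the conjugation type of a transitive finite verb.

    free_object is None when the definite, anaphoric third person object
    has no free exponent in the minimal clause, cf. (9); otherwise it is
    "pronoun", cf. (8) and (10), or "noun", cf. (11).
    """
    if free_object is None:
        # The objective suffix alone identifies the object.
        return "objective"
    if free_object in FREE_OBJECT_TYPES:
        # Any free object, definite or indefinite, goes with subjective forms.
        return "subjective"
    raise ValueError(f"unknown object type: {free_object!r}")
```

On this simplification, definiteness plays no role once a free object is present, which is what distinguishes the Nganasan pattern from a definiteness-driven agreement split.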


(11) Nganasan (Avam) (Northern Samoyedic; NOS. kehy luu.149, Languedoc. koujkia.006)

a. ponoi-? Sigifi-? luu-Zo-m-tu
one.more-ADV Ogre-GEN.PL parka-AUGM-ACC-SG.3SG(POSS)
seri-?2
put.on-PF(3SG.SC)
‘He has put on once more the ogre's parka.’

b. Ta-gata lakariarià? maagiia saü d'indi-?a-gaj.
that-ABL suddenly somewhat noise(ACC) hear-PF-3DU.SC
‘Then they suddenly heard some noise.’


In turn, constructions whose main verb exhibits an objective suffix alongside a
non-pronominal accusative object constituent are extremely rare. As has been elucidated in
Wratil (2013: 251-257), these object phrases have in common that they refer to topic
entities. They represent old or contextually presupposed information and are marked as
being definite by an appropriate possessor agreement morpheme. Moreover, they appear
in the left-peripheral position. This is illustrated by (12b). The unambiguously accusative
noun banamtu ‘dog’ establishes the white dog, which has been introduced earlier in the
discourse (cf. (12a)), as the primary topic.


(12) Nganasan (Avam) (Northern Samoyedic; Languedoc. rebjata. 031, 033)

a. ta-ta taharíabi? /nerabtuku?/ nima hon-tio ban-tu
well now /at.first/ name(ACC) have-PTCP dog-3SG(POSS)
tai-Stia denkua banu-Ta toti bojka?a
be.available-PST(3SG.SC) white dog-AUGM that old.man
‘The famous white dog originally belonged to the old man.’

b. ban-am-tu Stikiida-?a-du taharíaa buagalida-j
dog-ACC-3SG(POSS) strangle-AOR-3SG.OC now good.words-ACC.PL
nantama-ga-sa ban-am-tu mütomi-?o debakua
pray-ITER-INF dog-ACC.3SG(POSS) send-AOR(3SG.SC) red.GEN
turka-?a na-nta i-$a
lake-AUGM.GEN friend-LAT be-INF
‘The dog he strangled, praying good words he sent the dog to the ground of
the red lake.’


Hence, it is at least debatable whether the accusative noun banamtu is part of the
minimal clause containing the finite main verb inflected in the objective conjugation at
all. It is conceivable that banamtu is a left-dislocated topic constituent that is
referentially associated with a clause-internal resumptive pronoun or clitic. The agreement
suffix of the following objective verb form would represent the clause-internal resumptive
element in this case. The fact that banamtu precedes a finite verb inflected in the
subjective conjugation in the subsequent asyndetic conjunct of (12b) corroborates this
analysis. Since the discourse properties of the mentioned referent are fully defined by a
left dislocation procedure in the first conjunct, it behaves like a canonical object in
the second conjunct.

The distribution of objective verb forms described in this section allows us to conclude
that Nganasan is situated at an early stage in the development of the conjugational split.
In particular, the data in (9) and (10) suggest that the suffixes of objective verb forms
still include pronominal third person object arguments by themselves. Note that this
incorporation hypothesis complies with Havas's (2004) and Körtvély's (2005) assumptions
about the roots of the Uralic objective conjugation. On these assumptions, the
incompatibility of the objective suffixes with free clause-mate accusative pronouns can be
quite convincingly explained. Since pronominal clitics may be bound as resumptive elements
by a topicalized object phrase in clitic left-dislocation constructions, sentences like
(12b) also fit this analysis. But (12b) supports É. Kiss's (2010) topic agreement approach
to the development of the objective conjugation as well. This is because the objective
verb form süküóa?aóu ‘strangled’ in some sense points to the special topic status of the
sentence-initial object constituent.


4 Tundra Nenets: Information structuring and the 
objective inflection 


Tundra Nenets is the language spoken by the westernmost speech community of the
Northern Samoyedic region (cf. Abondolo 1998: iv; Nikolaeva 2014 among others). In
contrast to Nganasan, Tundra Nenets does not exhibit DOM on nouns. This is shown in
§4.1. Since, as pointed out in §4.2, its paradigm of personal pronouns has been enriched
with distinct accusative forms, Tundra Nenets also lacks DSM within its pronominal
system. Nevertheless, Tundra Nenets employs DAM in some sense. This is because, as
elucidated in §4.3, the Tundra Nenets objective suffixes have acquired the essential
features of ambiguous verbal agreement markers in the sense of Siewierska (1999: 225-331)
and at the same time assumed an information structuring function.


4.1 Uniform structural case marking on nouns 


DOC does not apply to Tundra Nenets nouns. Uniform accusative case marking prevails 
instead. In the singular number this is attributable to the analogical extension of the 
Uralic nominal marker *-m to all kinds of lexical objects. Therefore contrary to Nganasan, 
which has retained *-m merely in connection with possessor agreement markers, Tundra 
Nenets lacks differential accusative marking on object nouns in the singular number. This 
is illustrated by the example sentences of (13).⁹


(13) Tundra Nenets (Northern Samoyedic; NOS. tesjada nisjami.058, 023, NOS. tet
weli teta.105)

a. Ne tarem ma:
woman so say(3SG.SC)
‘The woman said: ...’

b. Ti Tes'ada n'is'e-mi m'apoj-m pod'erna.
so Tesjada father-1SG(POSS) small.reindeer.caravan-ACC harness(3SG.SC)
‘So, my father Tesjada harnessed a small reindeer caravan.’

c. Narka Wel'i teta xasawa n'u-m malca-xa-danta
big Welji farmer man child-ACC malice-DAT-3SG.DAT
nixibta-da, man-ma:
pull-3SG.OC pull-NARR
‘He caught hold of the malice of the son of the old Weli-farmer and said:’


⁹ There is a phonemic difference between the nasalizable and the non-nasalizable glottal
stop. The former is marked by h and the latter by q in a number of treatments of Nenets
phonology and morphology (cf. e.g. Salminen 1998: 522-523; Nikolaeva 2014: 18-19). For the
sake of simplicity, I follow Hajdú (1988) in not drawing a graphemic distinction between
the nasalizable and the non-nasalizable glottal stop. This pertains to the following
example sentences and tables, where ? covers both kinds of glottal stop.


The indefinite singular direct object in (13b) as well as the definite singular direct object 
in (13c) displays the accusative case marker -m. Due to this marker the singular objects 
of Tundra Nenets uniformly differ from the corresponding syntactic subjects, which are 
not case marked at all, such as n’e ‘woman’ in (13a). 

Leaving aside the dual object forms, which do not exhibit any specific case morpheme 
(Salminen 1998: 538; Nikolaeva 2014: 57-58), the uniform object case marking on Tundra 
Nenets object nouns in the plural number is simply due to the regular suffixation of the 
accusative plural marker -j. Nowadays -j has undergone a process of de-morphologization. 
As a result, the recent Tundra Nenets accusative plural objects are subject to a stem alter- 
nation (Mikola 1988: 238). Examples are given in (14), where (14a) displays the indefinite 
plural object noun ti ‘reindeers’ and (14b) the definite plural object p'ib'i ‘boots’. Both
of them have undergone a vowel change. 


(14) Tundra Nenets (Northern Samoyedic; Nikolaeva 2014: 472, NOS. tesjada
nisjami.037)

a. Tad’xaw’? yur” man” ti nikelna.
now(AFF) hundred about reindeer(PL+ACC) set.apart(3SG.SC)
‘It split up about a hundred reindeer (from the herd).’

b. Pi sawo jern’a pibi s'era-dm, wen'eko-dar'em
night good in.the.middle.of boot(PL+ACC) put.on-1SG dog-EQU
pin n'alkara-dm.
out slink-1SG
‘In the middle of the night I put on my boots and slipped out of the tent like a
dog.’


The latter sentence as well as (13c) shows that definiteness is not a sufficient condition
for the suffixation of possessive markers in Tundra Nenets. In (13c) n’um ‘child’ is
definite not only because of its thematic status in this part of the narration but also
because of its close affiliation to Wel’i, who is one of the protagonists of the story.
The definiteness of p'ib'i ‘boots’ in (14b) is due to its immediate associative relation
to the first-person narrator. Nevertheless, neither n’um nor p’ib’i displays any
possessive suffix. This is because the Tundra Nenets nominal possessor agreement markers
predominantly specify possessive relations between possessum nouns and possessors. They
do not function as object definiteness markers, let alone as differential object markers.

Plural object nouns displaying a possessor agreement marker are completely homonymous
with the corresponding nominative possessive forms (Nikolaeva 2014: 59). Since
possessum subjects formerly also exhibited the suffix -j as a connective morpheme, they
feature the same alternation as the plural accusative forms. This is shown in (15a) and
(15b). The nominal stem te ‘reindeer’ has undergone vowel change owing to the former
suffixation and subsequent de-morphologization of -j in its accusative and in its
nominative form. It cannot unambiguously be identified as subject or as object on the
morphological level.

(15) Tundra Nenets (Northern Samoyedic; NOS. tet weli teta.020, 022)

a. Tet jonar? tí-da nob-t mandal'a-d?.
four thousand reindeer(NOM+PL)-3SG(POSS) one-DAT assemble-3PL.RC
‘His four thousand reindeers assembled in one group.’

b. Tiki tí-da jarka, pod'er-ja-da.
that reindeer(ACC+PL)-3SG(POSS) catch(3SG.SUBJ) harness-PL.O-3SG.OC
‘He caught and harnessed these reindeers.’


The non-possessive plural subject forms, as illustrated in (16), however, differ from the
corresponding non-possessed objects in that they are provided with the plural suffix -?.


(16) Tundra Nenets (Northern Samoyedic; NOS. tet weli teta.094, 141)

a. ... n'enaca-? jab'el-mi-d
man-PL(NOM) make.drunk-PTCP.PASS-3PL.RC
‘... the people get drunk.’

b. ... Weliteta-? jamdaj-d?.
Weli.land.owner-PL(NOM) leave-3PL.RC
‘... the Weli-farmers left.’


Thus, Tundra Nenets employs DOM neither on singular nor on plural accusative nouns. 
It exhibits uniform structural case marking instead. Exactly this is outlined in Table 5. 


Table 5: Structural case markers on nouns in Tundra Nenets (Nikolaeva 2014: 61)

              singular               plural
              definite  indefinite   definite  indefinite

nominative    -         -            -?        -?
accusative    -m        -m           (sa)      (sa)


4.2 Suppletion in the paradigm of personal pronouns 


In contrast to the Nganasan paradigm of personal pronouns, the Tundra Nenets set of
personal pronouns morphologically differentiates between subject and object personal
pronouns by means of suppletion. As Hajdú (1988: 14-15) points out, this is due to the
grammaticalization of the Uralic lexeme s'i? ‘shape’. Owing to semantic bleaching, s'i? has
become a pronominal stem that currently represents the basis of the accusative and
genitive personal pronouns. The individual person and number specifications of these forms
are indicated by accusative and genitive possessor agreement suffixes (cf. Table 6).¹⁰


Table 6: Structural case paradigm of the Tundra Nenets personal pronouns
(Hajdú 1988: 14-15; Nikolaeva 2014)

         nominative  accusative  genitive

1SG      man         simi        Sin
2SG      pidar       Sit         sit
3SG      pida        Sita        Sita
1DUAL    mani?       Sid’n‘i?    Sid’n‘i?
2DUAL    pidari?     Sid'd'i?    Sid't'i?
3DUAL    pid i?      Sid'd'i?    sidt‘?
1PL      mańa?       sid'na?     Sid na?
2PL      pidara?     Sid'da?     sidta?
3PL      pido?       Sid'do?     Sid'to?


Moreover there is suppletion for person in the nominative array of the Tundra Nenets 
system of personal pronouns. The first person forms exhibit the stem man, the second 
and third person forms, however, the stem pi. As hypothesized by Castrén (1845), Lehti- 
salo (1939), Hajdú (1953) and Siegl (2008) pi does not descend from the Proto-Uralic or 
Proto-Samoyedic pronoun system. Whereas Castrén (1845: 342) assumed that the stem 
of the second and third person pronouns is originally Turkish, Hajdú (1953) proposes 
a contact-induced transfer from Ket. Siegl (2008: 120-121) finally supports Lehtisalo's 
(1939) hypothesis. He argues that the Tundra Nenets second and third person subject 
personal pronouns result from the grammaticalization of the Samoyedic lexeme pixid 
“body”. 

Regardless of which of these accounts proves right, the Tundra Nenets set of personal 
pronouns has obviously undergone diachronic processes that are not evidenced within 
the corresponding Nganasan system. Because of the exclusively Proto-Samoyedic/Uralic 
origin of its pronominal items, the latter is often conceived of as the most archaic pronom- 
inal system of the Northern Samoyedic languages (Siegl 2008: 120). Contrary to Nga- 
nasan, Tundra Nenets therefore behaves in quite an ordinary way with respect to the 
morphological realization of its pronominal subjects and objects. Owing to the dimen- 
sional progression described above, its subject personal pronouns are realized as overt 
free pronominal items only if they are used for emphasis (Salminen 1998: 540) and object 
pronouns are always overt and free. This applies to the SAP object pronouns. Their third 
person forms are different. As will be shown in the following section, they are neither 
canonical free pronouns nor incorporated objects. 


¹⁰ See footnote 9.


4.3 Object topic marking on finite verbs


The agreement markers of the Tundra Nenets objective conjugation listed in Table 7¹¹ do
not simply incorporate the direct object of a clause. Although they exhibit some essential
properties of anaphoric third person objects, they belong to ambiguous verbal agreement
markers in some sense.


Table 7: Verbal suffixes of the subjective, objective and reflexive conjugation
in Nenets (Hajdú 1988: 16-17; Nikolaeva 2014: 78-80)

         subjective  objective                             reflexive
                     singular  dual         plural

1SG      -(d°)m?     -w°       -xəyu-n°     an         -w°?
2SG      -n°         a dl      -xayu-d°     -ya-d°     -n°
3SG      Ø           -da       -xayu-da     -y-da      -?
1DUAL    -ñi?        -rhi?     -xayu-nir    -y-ñil     -ñi?
2DUAL    -d'i?       -fi?      -xayu-d'i?   -y-d'i?    -d'i?
3DUAL    -x(V^)?     -d'i?     -xoyu-d'i?   -y-d'i?    -x(V^)?
1PL      -wa?        -wa?      -xayu-na?    -y-na?     -na?
2PL      -da?        -da?      -xayu-da?    -y"-da?    -da?
3PL      -?          -do?      -xayu-do?    -y"-do?    -d'?


They are not completely incompatible with free clause-mate direct objects. However, 
due to their residual pronominal features they impose restrictive requirements on such 
complements. Above all, their third person specification excludes the insertion of SAP 
direct objects. As shown below, first (17a) and second (17b) person objects always precede 
a finite verb inflected in the subjective conjugation, the agreement suffixes of which are 
listed in the first column of Table 7. 


(17) Tundra Nenets (Northern Samoyedic; NOS. tesjada nisjami.060, Nikolaeva 2014:
447)

a. Tiki pu-d simi gawla.
that behind-ABL me(ACC) feed(3SG.SC)
‘After that she gave me some food.’

b. Xumpa' nci? sit yedara-dam-c”.
in.vain you(ACC) send-1SG.SC-PST
‘In vain I let you go.’


Moreover, their extant characteristics of definiteness cause a feature conflict with in- 
definite objects. Accordingly, as illustrated in the following examples, pronominal (18a) 
as well as non-pronominal (18b) indefinite objects obligatorily co-occur with subjective 
verb forms.


¹¹ See footnote 9.

(18) Tundra Nenets (Northern Samoyedic; Nikolaeva 2014: 436, NOS. tesjada
nisjami.076)

a. Yebtow”, pomke-m mane?ga-ney'??
darling(VOC) what-ACC see-2SG.SC.FOC
‘Darling, what can you see?’

b. Jaxa xara ta-xana gob m'ad'iko-m xo-dm?.
river curve there-LOC one small.tent-ACC find-1SG.SC
‘After the bend of the river I found a small tent.’


The combination of their third person specification and the definiteness limitation
finally blocks the appearance of free definite third person pronouns due to redundancy.
This is why finite verbs with objective suffixes identify the referents of unmarked
non-SAP object personal pronouns exclusively by themselves, cf. (19a) and (19b).


(19) Tundra Nenets (Northern Samoyedic; NOS. tet weli teta.066, 035)

a. Xuc'er? minku-da?
how marry-3SG.OC
‘How could he marry her?’

b. Tad maneya-da.
then behold-3SG.OC
‘Then he realized it.’


Definite free-standing accusative third person pronouns are allowed to appear as soon
as they are emphasized (Nikolaeva 2014: 386-389) or belong to the non-determinative
demonstrative pronouns. Like the Nganasan free definite pronominal objects, they usually
complement a verb inflected in the subjective conjugation. During her colloquial
elicitations Nikolaeva (2014: 201-210) recorded a clause like (20a), where the free third
person singular object personal pronoun s'ita ‘him’ receives contrastive stress. The
narrative texts of the Tundra Nenets database also contain clauses like (20b), the
pronominal object of which is a demonstrative pronoun bearing a possessor agreement affix.


(20) Tundra Nenets (Northern Samoyedic; Nikolaeva 2014: 203, 439)

a. N'is'a-da s'ita lado.
father-3SG(POSS) him(ACC) hit(3SG.SC)
‘His father hit him.’

b. T'ika-xeyu-da pod'erya.
this-ACC.DU-3SG(POSS) harness(3SG.SC)
‘He harnessed those two.’


¹² Nikolaeva (2014: 203) points out that some speakers of the Western Nenets dialect group
sometimes allow the co-occurrence of free third person object personal pronouns and
objective verb forms.

The only stressed object pronouns that optionally take an objective verb form are
reflexive pronominal expressions with the stem pixda (Nikolaeva 2014: 203), cf. (21a)
and (21b). This extraordinary optional co-occurrence may be due to some non-functional
residue that pixd ‘body’ still bears as a lexical category.


(21) Tundra Nenets (Northern Samoyedic; Nikolaeva 2014: 203)

a. pix'do-m'i lad'o-d'm.
REFL-1SG hit-1SG.SC
‘I hit myself.’

b. pix'do-m'i lad'o-w.
REFL-1SG hit-1SG.OC
‘I hit myself.’


This at least approximately conforms to the fact that the overwhelming majority of the
free direct objects that are accompanied by a verb inflected in the objective conjugation
in Tundra Nenets are non-pronominal anyhow (Körtvély 2005: 122). If, however, a
non-pronominal complement appears in a Tundra Nenets clause headed by an objective verb
form, it is definite and refers to an individuated and highly topical entity (Dalrymple &
Nikolaeva 2011: 125-139). On the morphosyntactic level this is reflected by the suffixation
of an appropriate possessor agreement morpheme on the one hand and by its appearance in
the left periphery or second position of the clause on the other hand. Usually, such
non-pronominal complements immediately follow the syntactic subject, as in (22d), or even
appear sentence-initially. The latter is illustrated in (22b) and (23b).


(22) Tundra Nenets (Northern Samoyedic; NOS. tesjada nisjami.003, 006, 009, 086)

a. Niis’a-m’i tan’a n'eb'a-mi tan'a n'ud'a
father-1SG(POSS) exist(3SG.SC) mother-1SG(POSS) exist(3SG.SC) young
papa-ko-m’i tan’a.
brother-DIM-1SG(POSS) exist(3SG.SC)
‘There is my father, my mother and my little brother.’

b. N'is'a-m'i Tes’ada-ne peer-c'eti-da.
father-ACC+1SG(POSS) Tesjada-ESS call-HAB-3SG.OC
‘My father is called Tesjada.’

c. Nobp-kuna n'ista-m'i n'eb'a-xa-n'i ma:
one-LOC father-1SG(POSS) mother-DAT-1SG(POSS) say(3SG.SC)
‘Once, my father told my mother:’

d. N'is'a-m'i jil’e-m’a-m-ta s'eroku-ta s'er
father-1SG(POSS) live-NMLZ-ACC-3SG(POSS) separate-3SG(POSS) affair
wad'eya-da.
tell-3SG.OC
‘My father told me what he lived through in detail.’

(23) Tundra Nenets (Northern Samoyedic; Nikolaeva 2014: 452-453)

a. Nuda Way” xada-wi n'e'ka-xanta tewi'-?.
little Waya kill-PSTPF.PTCP elder.brother-DAT.3SG(POSS) arrive-3SG.RC
‘Younger Waya reached the place where his murdered brother lay.’

b. Xalm'era-m-to? sida xoba-? ni? pega-do?.
dead.body-ACC-3PL(POSS) two skin-GEN onto put-3PL.OC
‘They put the dead body (of their brother) onto two skins.’

c. Lobeku-? neba ma: “Nemc'i-da temna sawa-?.
Lobeku-GEN mother say(3SG.SC) flesh.3PL-3SG(POSS) still good-3PL
‘Lobeku's mother said: “His muscles are still good.”’

d. Xod'ri? yil'le-bt'e-” xorta-nakew’.
of.course live-CAUS-MOD try-PROB.1SG.OC
‘I might try and revive him.’


The boldfaced direct object nouns in (22b) and (23b) are separated from the
sentence-final objective verb form by at least one constituent. In (22b) n'is'am'i ‘my
father’ converts its referent, introduced before (cf. (22a)), into the main discourse topic
and thereby designates the protagonist (cf. (22d)) at the very beginning of the story. In
(23b) xalm'eramto? ‘dead body’, which refers to Waya's murdered brother and belongs to the
old information (cf. (23a)), announces the main topic of the following direct speech (cf.
(23c), (23d)).

Thus, the relation between accusative complements and objective verb forms in Tundra
Nenets is reminiscent of the distribution of objective affixes in Nganasan. The Tundra
Nenets objective markers indicate that the direct object phrase they co-occur with is
or becomes the main topic of the following discourse. However, in Tundra Nenets
left-dislocation into a pre-sentential position is no longer an indispensable operation
that non-pronominal objects must undergo in order to be compatible with an objective verb
form (cf. (22d)). This implies that the objective affixes on Tundra Nenets finite verbs
have acquired some relevant properties of grammatical agreement markers. The development
of such functional features can presumably be described as a grammaticalization process
that started with the loss of the stylistic force which left-dislocated constituents
originally exerted. As a consequence of this loss, the formerly left-dislocated
constituents were reanalyzed as clause-internal topic constituents and the formerly bound
resumptive clitics as agreement markers attaching to the respective verb under certain
conditions. Since only non-pronominal constituents underwent topicalization by clitic
left-dislocation, the third person specification of the former resumptive elements has
been preserved. And since, moreover, the conditions under which these elements appeared in
the presence of object constituents have always been defined by the pragmatic status of
the latter, the newly emerged agreement markers developed the information structuring
functions of topic markers by a process of pragmaticalization (cf. Diewald 2011).

It is conceivable that exactly this diachronic process is responsible for the mechanism
of DOM that nowadays holds in Tundra Nenets. Its objective agreement suffixes on the
finite verb indicate that the non-SAP object deviates from the prototypical patient
argument in that it is definite and establishes the current discourse topic. Thus, Tundra
Nenets differentially marks object topics by means of differential object indexing (DOI).
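
The conditions on the Tundra Nenets objective conjugation discussed in this section can be summarized as a decision procedure. The Python sketch below is a simplified illustration, not a full model: the class `DirectObject`, its attribute names and the function `tundra_nenets_conjugation` are hypothetical labels introduced here, topicality is reduced to a single Boolean, and the optional objective marking with stressed reflexive pronouns, cf. (21), as well as dialectal variation are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectObject:
    person: int            # 1, 2 or 3
    definite: bool
    pronominal: bool
    overt: bool            # realized as a free form in the clause
    topical: bool = False  # is or becomes the main discourse topic


def tundra_nenets_conjugation(obj: DirectObject) -> str:
    """Return "subjective" or "objective" for a transitive finite verb."""
    if obj.person not in (1, 2, 3):
        raise ValueError(f"invalid person value: {obj.person!r}")
    if obj.person in (1, 2):
        # SAP objects clash with the third person specification, cf. (17).
        return "subjective"
    if not obj.definite:
        # Indefinite objects clash with the definiteness feature, cf. (18).
        return "subjective"
    if obj.pronominal:
        # Free (emphasized or demonstrative) pronouns go with subjective
        # forms, cf. (20); unexpressed anaphoric objects are identified
        # by the objective suffix alone, cf. (19).
        return "subjective" if obj.overt else "objective"
    # Definite non-pronominal objects trigger objective forms only when
    # they are topical, cf. (22b), (23b) versus (14b).
    return "objective" if obj.topical else "subjective"
```

For example, a definite, overt, topical noun phrase such as the object in (23b) yields `"objective"`, whereas the definite but non-topical object in (14b) yields `"subjective"`.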


5 Forest Enets: Differential object marking on finite verbs 


The Enets language area is located in the lower Yenisei region (Janhunen 1998: 457), 
which extends to the Kara Sea in the North. In the west it borders on the Nenets and 
in the east on the Nganasan language area. Its southernmost Samoyedic neighbor is the 
Selkup region. There are two Enets dialects: Forest (Bai) Enets and Tundra (Maddu) Enets, 
the predominant of which, Forest Enets, is considered in the following. 

Forest Enets is in a much more moribund state than Nganasan and Nenets (Siegl 2013:
30-57). It features a number of morphosyntactic characteristics that have to be seen as
an advancement of the diachronic processes that are attested for the other Northern
Samoyedic languages. While the distinct morphology of structural case marking on its
nouns is progressively eroding, as shown in §5.1, the suffixes of the objective conjugation
gain more and more weight in the relational assignment of arguments, which is
elucidated in §5.3. The Forest Enets personal pronouns are not affected by the loss of specific
morphology. On the contrary, similarly to the Tundra Nenets personal pronouns, they
have established a structural case distinction by the adoption of supplementary forms.
This is illustrated in §5.2.


5.1 The erosion of structural case marking on nouns 


In Forest Enets the Uralic nominal accusative marker *-m has vanished almost entirely 
(Künnap 1999: 13-14). With the only exception of a few nouns that belong to a subgroup 
of the second inflectional class and undergo stem alternation in the accusative paradigm 
(Siegl 2013: 121-124), singular direct objects morphologically conform to the correspond- 
ing singular subject nouns in that they are not case marked at all. As shown in (24), te 
'reindeer' gets along without any specific case marker regardless of whether it is selected 
as syntactic subject (cf. (24a)) or object (cf. (24b)). 


(24) Forest Enets (Northern Samoyedic; NOS. text 39.015, text 39.030) 
a. Te nebr-ió...
reindeer run.away-3SG.RC
‘The reindeer runs away.’

b. ... to ar te kaóa-ó
such size reindeer(ACC) kill-1SG.SC
‘I have killed such a big reindeer.’


This holds true at least for all non-possessive forms. Their possessive counterparts 
still bear traces of the suffix *-m (Mikola 1988: 242). Owing to its coalescence with the 
respective adjoining possessive affixes, they exhibit portmanteau morphs encoding case 
and possessor agreement that are (at least in the case of a second or third person
possessor specification) morphologically distinct from the respective possessor agreement
morphemes attached to subject nouns. This is shown in (25) where the accusative third 
person dual possessor agreement suffix of (25b) deviates from its nominative counterpart 
in (25a) due to its previous fusion with *-m. 


(25) Forest Enets (Northern Samoyedic; Siegl 2013: 479-480) 
a. Kiuda Ser to-sau-jet sama-di?.
morning(GEN) before come-PROB+PST(3SG.SC)-EMPH beast-3DU(POSS)
‘But in the morning their bear apparently came.’

b. Oti-di? oti-Oi? bogl'a-di?.
wait-3DU.OC wait-3DU.OC bear-ACC+3DU(POSS)
‘They waited for their bear.’


Like the dual subject and object forms, the plural non-possessum subject (cf. (26a)) and
object (cf. (26b)) forms are subject to a natural syncretism. This is due to the fact that after
the de-morphologization and definitive loss of the plural marker *-j, the former subject
plural marker -? has entered the paradigm of plural non-possessive objects (Mikola 1988:
238).


(26) Forest Enets (Northern Samoyedic; Siegl 2013: 477, 479) 
a. can-da mi-n kari-? toná-bi-é
tub-GEN+3SG(POSS) in-LOC fish-PL exist-PRF-PST+3PL.SC
‘(and) in a tub there were fishes’

b. Salba ne-on kari-? noo-bi-s.
ice(GEN) on-PROL fish-PL(ACC) take-PRF-PST+3SG.SC
‘Along the ice, the bear took fishes along.’


Since in Northern Samoyedic the possessor agreement affixes on plural nouns do not
show any distinction with regard to the subject or object function of the corresponding
arguments, the paradigm of the possessive plural nouns also lacks any
nominative-accusative distinction. That is why the object kasióu ‘men’ in (27b) exactly matches the
corresponding subject form in (27a).


(27) Forest Enets (Northern Samoyedic; NOS. text.39.043, Languedoc. otpusk.029) 


a. Kasi-du d'oxara-?
man-PL-3PL(POSS) not.know-3PL.SC
‘The men do not know each other.’

b. Kutui-du kasi-àu paroxodo-xoóo
some(ACC)-PL+3PL(POSS) man(ACC)-PL+3PL(POSS) steamer-ABL
karaa-t’i...
take.along-3PL.SC+PST
‘They took along some of their fellows with the steamer.’
Thus, with the only exception of a number of non-possessum singular nouns belonging
to the second declensional class and of all singular accusative nouns displaying a
second or third person possessor agreement affix in the singular number, objects are not 
distinguishable from subjects on the basis of their inflectional morphology. Like Tundra 
Nenets, Forest Enets dispenses with DOM on nouns entirely. Neither definiteness nor 
indefiniteness of direct objects is indicated by any special case marker or obligatorily 
associated with the presence or absence of any possessor agreement suffix. Exactly this 
is sketched in Table 8. 


Table 8: Structural case/definiteness markers on nouns in Forest Enets


              singular                 plural
              definite   indefinite    definite   indefinite
nominative    -          -             -?         -?
accusative    (sa)       (sa)          -?         -?


5.2 Hybrid forms in the paradigm of personal pronouns 


One thing that the Forest Enets pronominal system has in common with the Tundra
Nenets pronominal system is that the introduction of the grammaticalized morpheme
si? has resulted in the removal of the structural case syncretism from the paradigm
of personal pronouns. However, it differs from the Tundra Nenets system in that the
new inflected forms of si? do not always simply replace the original syncretic pronouns.
Rather they form an optional part of complex pronouns that also consist of the respective
unmarked singular, dual and plural personal pronouns (Künnap 1999: 20-22). The
corresponding paradigms of the structural cases are given in Table 9,¹³ the last two columns
of which contain bipartite forms headed by a form of si?.

Prokovjev (1937: 76) was the first to notice the divergence of a number of Forest
Enets personal pronouns from the corresponding genuine Uralic and Samoyedic pronominal
items and their resemblance to personal pronouns used in the Yeniseian languages.
Nowadays Uralists by and large agree that their second and third person nominative singular
forms have been directly borrowed from the Yeniseian language Ket (Terescenko
1966: 456; Siegl 2008: 119-121). Their dual and plural forms are, like the corresponding
first person forms, provided with common Uralic number markers (Siegl 2008: 124-127).
To this day they encode the person and number specification of the respective
accusative forms whenever they are not omitted. Consequently, with the exception of
the second and third person singular and all first person and non-complex forms, the Forest
Enets personal pronouns are hybrid forms. They are composed of inherited Uralic
and borrowed Ket morphemes. Accordingly, through borrowing and grammaticalization


¹³Note that Siegl (2013: 186-187), in contrast to Künnap (1999: 20-21) and Sorokina (2010: 227-229) among
others, denies the existence of genitive personal pronouns in Forest Enets.
Table 9: Structural case paradigm of the Forest Enets personal pronouns (Künnap
1999: 21; Siegl 2013: 186-187)


          nominative      accusative            genitive
1SG       mod’ (mud")     (mod") &i(j)?         (mod’) sin
2SG       uu              (uu) Sit              (u) sit
3SG       bu              (bu) šita             (bu) sita
1DUAL     mod in?         (mod ini?) sidin?     (modif?) sidin
2DUAL     uudi?           (uudi?) šiðði?        (udi?) sióti?
3DUAL     bud'i?          (bud i?) sidid i      (budi?) siddi
1PL       mod'na?         (mod na?) sióna?      (modina?) sióna?
2PL       uuda?           (uuda?) sidda?        (uda?) sióta?
3PL       budu?           (budu?) siddu?        (budu?) siótu?


Forest Enets has developed a suppletive paradigm of personal pronouns that, like the 
corresponding Nenets paradigm, features a morpheme-based distinction between the 
structural cases. 

In discourse situations the Forest Enets subject pronouns are optionally omitted
if they are not emphasized (Künnap 1999: 37). The corresponding object pronouns,
however, are always overt with the partial exception of the third person forms. Like 
their Tundra Nenets counterparts, these pronouns are no longer fully realized as clausal 
arguments by the agreement morphology of finite verbs inflected in the objective conju- 
gation. Although the Forest Enets objective affixes still retain some essential properties 
of anaphoric third person objects, they have already gone one step further on the devel- 
opmental path to grammatical object agreement morphemes than the Nenets objective 
affixes. This is elucidated in the following section. 


5.3 Object definiteness marking on finite verbs 


The agreement markers of the three Forest Enets conjugation types are compiled in Ta- 
ble 10. 

With respect to the choice between the subjective and the objective inflection in the 
presence of pronominal direct objects Enets slightly deviates from Nenets. Like in Tun- 
dra Nenets, in Forest Enets SAP object pronouns, for example, the second person sin- 
gular accusative personal pronoun s'it ‘you’ in (28a), as well as indefinite third person 
pronouns, like the interrogative pronoun obu ‘what’ in (28b), are accompanied by finite 
verbs inflected in the subjective conjugation. 
Table 10: Verbal suffixes of the subjective, objective and reflexive conjugation
in Enets (Siegl 2013: 247-260)


          subjective    objective                              reflexive
                        singular      dual        plural
1SG       -ó?           -Q,-u,-b      -xu-n       -i-n         -i -j?, -b?
2SG       -d            -r            -xu-ó       -i-ó         -i-d‘
3SG       Ø             -ğa           -xu-ğa      -i-da        -i-07
1DUAL     -j2, -b?      -j2, b?       -xu-ń?      -i-ń?        -i-b?
2DUAL     -ri?          -ri?          -xu-ği?     -i-ği?       -i-Ói?
3DUAL     -xi?          -0i?          -xu-Oi?     -i-0i2       -i-xi?
1PL       -a? ba?       -a?, ba?      -xu-na?     -i-na?       -i-na?
2PL       -ra?          -ra?          -xu-Óa?     -i-Óa?       -i-da?
3PL       -?            -ğu?          -xu-ğu?     -i-Óu?       -i-Ó?


(28) Forest Enets (Northern Samoyedic; NOS. text 39.017, text 39.004) 
a. mod si-t kojta-da-Ó
I you-ACC.SG set.up-FUT-1SG.SC
‘I will trick you.’

b. obu cken pon'i-ya-d
what this-LOC.ADV do-FREQ-2SG.SC
‘What are you doing here?’


Likewise, non-pronominal objects that are indefinite like nubai ‘a mat’ in (29a) and 
koba? ‘skins’ in (29b) require a finite verb form of the subjective paradigm. 


(29) Forest Enets (Northern Samoyedic; Siegl 2013: 47) 
a. Toégod čiki kadi láxáci ne-on nubai pu-da-?.
then this fur(GEN) twig(ACC) on-PROL mat(ACC) lay-FUT-3PL.SC
‘Then they will lay a mat on the fur twigs.’

b. Nubai ne-on añ? čiki mu koba-? láxta-da-?
mat(GEN) on-PROL FOC this so skin(PL) spread-FUT-3PL.SC
‘Over the mat, they will spread out skins.’


Also, like in Tundra Nenets, finite verbs inflected in the objective conjugation only 
co-occur with definite third person objects. But in Forest Enets, unlike in Tundra Nenets,
free definite third person object pronouns are not exempt from this. More precisely, if a 
third person definite pronoun, as for example any strong third person personal or any 
demonstrative pronoun, is inserted into a clause, the corresponding finite verb normally 
inflects in the objective conjugation. This is illustrated in (30a) and (30b). 
(30) Forest Enets (Northern Samoyedic; Siegl 2013: 252, 468)


a. Mud” sita soida-n tüne-u.
I he(ACC) good-PROL know-1SG.OC
‘I know him well.’

b. Ciki-ru-óa oo-ma-óa.
this-LIM-3SG eat-RES-3SG.OC
‘Only this it had eaten.’


Nevertheless, the objective affixes of the Forest Enets verbal inflection are still able to
represent anaphoric third person objects by the person features of their pronominal
predecessors. Accordingly, they block the appearance of non-emphatic anaphoric third
person personal pronouns for reasons of redundancy. Clauses in which the third person
definite pronominal object is not independently realized, as in (31a) and (31b), are
therefore much more frequent than clauses like (30a).


(31) Forest Enets (Northern Samoyedic; NOS. text 01.009, Siegl 2013: 269) 


a. Mod” nas’il tuda-a-b-o-s’.
I not.easily recognize-PRS-1SG.OC-EP-PST
‘I hardly recognized him.’

b. Sirta-b-i-da bocka mi-?
salt-PRF-OBJ.PL-3SG.OC barrel(GEN) in-LAT
‘They salted them into a barrel.’


Forest Enets furthermore differs from Tundra Nenets in that the non-pronominal com- 
plements of objective predicates need not reside in the left area of the clause and do not 
even obligatorily refer to the discourse topic. 

In most cases the referent of lexical direct objects that complement an objective verb
form is definite and at the same time topical insofar as it has been introduced in the
preceding context. In (32), for example, the reindeer and the mouse are established as
protagonists at the beginning of the story (32a). In its concluding statement (32b) the
direct objects te ‘reindeer’ and tobik ‘mouse’ therefore belong to the old information.
They are definite and their referents are highly topical. That is why te ‘reindeer’ and
tobik ‘mouse’ obligatorily co-occur with an objective verb form in (32b).


(32) Forest Enets (Northern Samoyedic; NOS. text 39.001, Siegl 2013: 269) 


a. diri-bi no-l'u d'a-xan tobik an’ te
live-NARR(3SG.SC) one-LIM earth-LOC.SG mouse and reindeer
‘There lived on the earth a mouse and a reindeer.’

b. te d'oxara-óa tobik, tobik d'oxara-óa
reindeer not.know-3SG.OC mouse(ACC) mouse not.know-3SG.OC
te
reindeer(ACC)
‘The reindeer does not know the mouse and the mouse does not know the
reindeer.’


However, the definiteness of non-pronominal objects accompanied by a finite verb
inflected in the objective conjugation is not necessarily pragmatically motivated. Semantic
definiteness is a sufficient criterion for direct objects to become a complement of an
objective verb form in Forest Enets. D’urak baða ‘Nenets language’ in (33b) and nu ‘door’
in (34b),¹⁴ for example, are part of the new information (cf. (33a), (34a)).


(33) Forest Enets (Northern Samoyedic; NOS. text 01.016, NOS. text 01.017) 
a. Mod’ onaj baóa-an sujóa-an d’uri-na-d.
I true language-PROL.SG good-PROL.SG say-PRS-1SG.SC
‘I speak Enets well.’

b. D'urak baða pubtore? sujda-an tenee-w
Nenets language(ACC) also good-PROL.SG know-1SG.OC
‘I also speak Nenets well.’


(34) Forest Enets (Northern Samoyedic; Siegl 2013: 489-490) 


a. Mud'na okruzkom aga bem asi
we(PL) party.committee(GEN) big boss father(ACC)
máku-xuó-da mosa-xa-da kada-bi-da.
house-ABL.SG-3SG(POSS) work-LAT.SG-3SG(POSS) take-PRF-3SG.OC
‘An official from our party committee came to take father from his house to
work.’

b. Äsi-j pe-t kani-ta-§ nu lokri
father-1SG(POSS) street-LAT go-FUT-PST(3SG.SC) door suddenly
toru-da
close-3SG.OC
‘My father went out on the street and suddenly closed the door.’


Owing to the uniqueness of the referent in the case of d'urak baóa and due to the 
evident associative relation of the object referent in the case of nu to an already imple- 
mented referent (here: the house of the father (cf. 34a)) they are definite as a result of 
the encyclopedic knowledge of the discourse participants. Their definiteness is therefore 
semantically motivated and triggers agreement in the objective conjugation, as can be 
observed in (33) and (34). 


¹⁴Siegl (2013: 490) himself points out that the combination of a future and a past tense marker is semantically
unexpected.
Hence, the Forest Enets objective affixes, like the Tundra Nenets objective affixes,
indicate specific properties of selected object arguments via a grammatical agreement
relation. The Forest Enets verb takes an agreement suffix of the objective conjugation if
its third person direct object deviates from the prototypical patient argument in being 
definite. Supported by its object number specification it establishes the basic syntactic 
function of the occurring nominal expressions, which, by and large, have lost their struc- 
tural case morphology. Accordingly, the relation between the Tundra Nenets and the 
Forest Enets objective suffixes is characterized by an increase of syntactic obligatoriness 
and the grammaticalization from pragmatic definiteness to semantic definiteness mark- 
ing (cf. Lehmann 1982: 57; Himmelmann 1997: 39). That is why Forest Enets DOI does 
not merely reflect pragmatic characteristics of the selected third person objects like the 
Nenets objective agreement marking. Rather it also fulfills a discriminatory function in 
that it distinguishes between arguments and their roles. 


6 Conclusion 


It has been shown in this paper that in the Northern Samoyedic languages Nganasan, 
Tundra Nenets and Forest Enets the grammaticalization of objective agreement mark- 
ers on verbs goes hand in hand with the specific development of accusative case and 
definiteness markers on nouns. 

The northeastern language Nganasan has brought forth a system of DOM that
exclusively applies to nouns. This is due to various phonological processes that have affected
accusative case markers and to the grammaticalization of possessor agreement affixes to
definiteness markers. The agreement markers on Nganasan finite verbs do not yet serve
as DOM in the proper sense. Their objective affixes incorporate anaphoric third
person object arguments. They only co-occur with free object constituents if they are
bound by the latter in a typical clitic left-dislocation construction. In the northwestern
language Tundra Nenets DOM of nouns does not exist. Uniform accusative case marking
prevails instead and nominal possessor agreement markers predominantly specify
possessivity relations between possessum nouns and possessors. However, the agree- 
ment morphemes of the Tundra Nenets objective conjugation have adopted functional 
features of object agreement markers that enable them to reflect the non-typical behav- 
ior of syntactic objects in information structuring. In this way, the inflectional system 
of the Tundra Nenets finite verbs has acquired the function of DOI by a process of gram- 
maticalization. In Forest Enets, the central Northern Samoyedic language, the agreement 
morphemes of the objective conjugation already exhibit evident features of full-fledged 
head-marking verb suffixes. They indicate the presence of a definite third person direct 
object. Since Forest Enets differs from Tundra Nenets in that the mere structural case 
marking on its nouns is becoming extinct, the choice of the respective verbal agreement 
allomorph in Forest Enets serves to distinguish between clausal arguments and their 
roles. 

Since the Uralic SAP pronouns are neither immediately affected by the emergence 
and loss of nominal differential object markers nor involved in the grammaticalization 
of the objective agreement suffixes on verbs, the Northern Samoyedic system of personal
pronouns has developed independently. In Tundra Nenets and Forest Enets it has under- 
gone a significant dimensional progression. In contrast to Nganasan, which employs a 
system of morphological realization and non-realization drawing a distinction between 
pronominal agent and patient arguments, Tundra Nenets and Forest Enets have gram- 
maticalized the morpheme si? which nowadays represents the direct object forms by 
suppletion. This is summarized in Figure 2: 


[Figure 2: diagram not recoverable from the extracted text. Its columns are: distinctive case marking on object nouns; definiteness marking on object nouns; distinctive case marking on personal pronouns; verbal objective suffixes; DOM. Its rows compare the Northern Samoyedic languages, among them Avam Nganasan and Tundra Nenets.]


Figure 2: The development of structural case marking on nouns and pronouns
and of the objective conjugation in Northern Samoyedic


Abbreviations 

1         first person              DOC      differential object case marking
2         second person             DU       dual
3         third person              DUR      durative
ABL       ablative                  EMPH     emphasis
ACC       accusative                EP       epenthetic vowel
ADV       adverbial suffix          ESS      essive
AOR       aorist                    EXCL     exclamative
AUGM      augmentative              FOC      focus marker
CAR       caritative                FREQ     frequentative
CAUS      causative                 FUT      future
CNEG      connegative               GEN      genitive
DAT       dative                    HAB      habituative
DEST      destinative               IMPFUT   imperative future
DIM       diminutive                INCH     inchoative
INF       infinitive                PST      past tense
INTER     interrogative marker      PSTPF    past perfect
IPFV      imperfective              PF       present perfect
ITER      iterative                 PL       plural
LAT       lative                    POSS     possessive
LIM       limitative                PRF      perfect
LOC       locative                  PROB     probabilitative
MOD       modal gerund              PROL     prolative
NARR      narrative                 PRS      present continuous
NEGAUX    negation auxiliary        PTCP     participle
NMLZ      nominalizer               RC       reflexive conjugation
NOM       nominative                RES      resultative
O         object                    SC       subjective conjugation
OC        objective conjugation     SG       singular
PASS      passive                   SUP      supine


Data sources: 


Stories Два чума (Languedoc. dva čuma), Как утонули ребята (Languedoc. rebjata),
Как сгорела наша школа (Languedoc. skola) from the online corpus of the
project “Languedoc”, available at
http://www.philol.msu.ru/~languedoc/rus/ngan/corpus.php [accessed on June 24, 2017]


Story Отпуск (Languedoc. otpusk) from the online corpus of the project “Languedoc”,
available at http://www.philol.msu.ru/~languedoc/rus/enets/corpus.php
[accessed on June 24, 2017]


Stories Kehy Luu (NOS. kehy luu), Mou Djamezi (NOS. mou djamezi) from the
online corpus of the project “Negation in Ob-Ugric and Samoyedic Languages”,
University of Vienna, available at
http://www.univie.ac.at/negation/sprachen/nganasanischa.html [accessed on June 24, 2017]


Stories Tesjada Nisjami (NOS. tesjada nisjami), Tet Weli Teta (NOS. tet weli teta)
from the online corpus of the project “Negation in Ob-Ugric and Samoyedic
Languages”, University of Vienna, available at
http://www.univie.ac.at/negation/sprachen/nenzischa.html [accessed on June 24, 2017]


Stories Text 1 (NOS. text 01), Text 39 (NOS. text 39) from the online corpus of
the project “Negation in Ob-Ugric and Samoyedic Languages”, University of Vienna,
available at http://www.univie.ac.at/negation/sprachen/enzischa.html
[accessed on June 24, 2017]
Chapter 13


Differential A and S marking in Sumi 
(Naga): Synchronic and diachronic 
considerations 


Amos Teo 


University of Oregon 


This paper presents data on the argument marking system of Sumi, a Tibeto-Burman lan- 
guage of Nagaland, and examines the possible diachronic sources of differential A and S 
marking in the language. In Sumi, there is a two-way contrast for A arguments (=no vs. =ye)
and a three-way contrast for S arguments (=no vs. =ye vs. unmarked). I examine the triggers 
of such differential marking, looking at semantic factors associated with transitivity, as well 
as pragmatic factors associated with information structure. 


In transitive clauses, =no is more commonly found on A arguments, where it marks a seman- 
tic agent, while =ye on A arguments signals a lack of agentivity. In intransitive clauses, =no 
on S arguments marks contrastive focus, while =ye marks a contrastive topic, or sometimes 
continuing reference. 

Based on available synchronic data from Sumi and related languages, I examine the
possibility that one source for the marker =ye is an old locative marker. I also examine potential
sources for the marker =no, which has cognates across the language family that function as
agentives or ergatives, as well as instrumentals and ablatives.


1 Introduction 


Sumi, also known as Sema or Simi, is a Tibeto-Burman language spoken in Nagaland, 
North-East India. Like many other Tibeto-Burman languages of the area, Sumi displays 
semantically and pragmatically motivated differential A and S argument marking.¹ This
type of differential argument marking is not unusual for the area, where it appears that 
semantic and pragmatic factors play a major role in the distribution of what is sometimes 


¹In this paper, I follow Dixon's (1994) use of the terms A, S and O to refer to the subject of a transitive clause,
the subject of an intransitive clause, and the object of a transitive clause, respectively.


Amos Teo. Differential A and S marking in Sumi (Naga): Synchronic and diachronic
considerations. In Ilja A. Serzant & Alena Witzlack-Makarevich (eds.), Diachrony
of differential argument marking, 345-362. Berlin: Language Science Press.
described as the ‘ergative’ or the ‘agentive’ in these languages. Similar patterns of
argument marking are found in other languages of Nagaland, including Mongsen Ao (Coupe
2007; 2011), and more generally across Tibeto-Burman (see DeLancey 2011; Chelliah & 
Hyslop 2012). 

What is unusual about Sumi, at least for a Tibeto-Burman language of the area, is
that I find a two-way distinction with A marking, a choice between the two enclitics =ye and
=no, but a three-way distinction with S marking, a choice between the two enclitics =ye
and =no and no overt morphological marking. In addition, O arguments are unmarked.
By comparison, two closely related languages, Khezha and Mao, display the more typical
‘optional ergative’ system, where A and S may take an overt ‘ergative’ marker vs. no overt
morphological marking, in addition to differential O argument marking. For instance,
Khezha has an ‘optional’ ergative marker nú (glossed ‘nominative’ by Kapfo 2005) on A
arguments, as well as an ‘optional’ patientive / locative marker eh /é/ on O arguments.

Traditional accounts of differential argument marking have focused on differential 
object marking and the role of animacy and definiteness (e.g. Bossong 1983; 1985; Aissen 
2003). More recent work on differential argument marking has also looked at the role 
of information structure (e.g. Dalrymple & Nikolaeva 2011; Iemmolo & Klumpp 2014).
Comparatively fewer studies have examined differential subject marking / differential 
agent marking / optional ergativity, with notable exceptions such as de Hoop & de Swart 
(2008) and McGregor & Verstraete (2010). Although differential subject marking has been 
assumed to be the mirror counterpart of differential object marking, there is evidence 
suggesting that the triggers of both are not identical (Malchukov 2008; Fauconnier 2011). 

Malchukov (2008) also argues that the indexation of animacy is simply an epiphe- 
nomenon associated with the expression of two potentially competing functions: the 
indexation of semantic roles, and the differentiation of subjects from objects. Faucon- 
nier (2011) considers the role of animacy, but rejects the notion that “Agents” (defined 
as participants in the A role) and “Objects” can be placed on a single animacy hierarchy 
(as per Silverstein 1976). Rather, she suggests that unexpectedness plays a crucial role 
in differential agent marking, where for instance, inanimates are not expected to act as 
Agents and may receive special morphological marking or be restricted from appear- 
ing as Agents. Similarly, McGregor (2010) shows that in Gooniyandi and Warrwa, the 
absence of an ergative morpheme on an A argument marks an unusual or unexpected 
A. 

Similarly, it will be shown that in Sumi, differential A and S marking is not triggered 
by some inherent animacy of the referent, but by the interaction between situational fac- 
tors such as agentivity, defined by the degree of volition, control and purpose associated 
with a referent in a particular situation; and discourse pragmatic functions, including 
the marking of contrastiveness and unexpectedness. However, this notion of
‘unexpectedness’ is primarily about the management of listener-based and/or speaker-based
expectations.


²LaPolla (1995) distinguishes ‘ergative’ from ‘agentive’ marking thus: the former is ‘systematic’ (in other
words, the A argument is consistently marked), while the latter is ‘non-systematic’ (in other words, what
one might call ‘differential A marking’ or ‘optional ergativity’, e.g. Chelliah & Hyslop 2012; McGregor 2010).
In this paper, I use the terms ‘ergative’ and ‘agentive’ in a similar fashion to LaPolla.
In §2, I first give some background on Sumi and describe the circumstances under
which argument marking is obligatory in the language. In §3, I then describe some of
the triggers of differential A and S marking. In §4, I consider the diachronic origins of
these markers by presenting both language-internal and cross-linguistic evidence.
Finally, in §5, I summarize the findings and consider future avenues of inquiry.


2 Language background 


Sumi, also known as Sema, is a Tibeto-Burman language spoken by an estimated 104,000 
speakers (Lewis et al. 2013) mainly in Nagaland, North-East India. Burling (2003) classi- 
fies Sumi as a member of the Angami-Pochuri group, along with Angami, Khezha and 
Mao. Many Sumi speakers also speak English, as well as Nagamese, an Assamese-based 
creole. The canonical word order of Sumi is AOV / SV, like other Tibeto-Burman lan- 
guages of the region. In Sumi transitive clauses, A arguments must be marked by either 
=ye or =no,³ while O arguments are typically unmarked morphologically when they
occur right before the verb, as seen in (1)-(3).


(1) Sumi (Tibeto-Burman; Kutili bird. short, line 23)⁴
[a-zü=no]A [küma]O yipesü-u-ve⁵
[NRL-water=no]A [3DU]O sweep-go-VM
‘... the water swept them both away.’


(2) *a-zü küma yipesü-u-ve


(3) Sumi (Tibeto-Burman; Origin of axone, line 5)
[küma=ye]A [a-kishina]O chu-kha-mo-ve-ke-hu
[3DU=ye]A [NRL-lunch]O eat-NCPL-NEG-VM=NZR=DIST
‘... they were unable to finish their lunch ...’


First and second person singular pronominal O arguments are realized as proclitics 
on verbs, as in (4) and (5). 


(4) Sumi (Tibeto-Burman; elicited) 
Pa=no o=he.
3SG=no 2SG=hit
‘He hit you.’


³There is in fact a third option, the additive ghi ‘also’. Additionally, speakers may choose to omit an NP
altogether. However, these will not be discussed in this paper.

⁴Examples from texts can be found at http://catalog.paradisec.org.au/repository/ABT1. Cite as: Amos
Teo (collector). 2008. Sumi (India) (ABT1), Digital collection managed by PARADISEC. DOI:
10.4225/72/56E7A73CE9FA7

⁵In this paper, examples are given in the working orthography, which does not consistently mark tones. The
graph ü represents a high central unrounded vowel /ɨ/.

⁶Note that these pronominal proclitics are identical in form to the possessive pronominal prefixes.
(5) Sumi (Tibeto-Burman; elicited)
No=no i=he.
2SG=no 1SG=hit
‘You hit me.’


In intransitive clauses, S arguments can be morphologically unmarked, as in (6), or 
marked with either =ye or =no, as in (7) and (8) respectively. 


(6) Sumi (Tibeto-Burman; Origin of axone, line 3) 
Küma a-lu=lo hu-niye-ke-lo
3DU NRL-field=LOC go.field-PROS-NZR-LOC
‘While the two were about to go to the field ...’


(7) Sumi (Tibeto-Burman; Telephone conversation01, line 4) 
O Kivi=ye zü a-phi.
EXCL Kivi=ye sleep PROG-CONT
‘Oh, Kivi is still sleeping’ (2nd mention)


(8) Sumi (Tibeto-Burman, elicited) 
Kivi=no zü a-ni.
Kivi=no sleep PROG-NPST
‘Kivi (not someone else) is sleeping.’


The obligatoriness of argument marking therefore depends largely on clause type: 
A arguments (and as I shall demonstrate, the first NPs in non-verbal clauses) must be 
accompanied by either =ye or =no, while S arguments may be marked by =ye, =no or be 
morphologically unmarked. In all cases, the choice of marking, or lack thereof, depends 
largely on semantic and pragmatic factors. These will be examined in the next section. 
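
The licensing pattern just described can be restated as a small lookup table. The following Python sketch is only an illustration of the generalization in this section, not part of the analysis itself: the names `LICENSED_MARKERS` and `is_licensed` are hypothetical, and `None` is used here (as an assumption of the sketch) to stand for the absence of a marker.

```python
# Illustrative sketch only: which markers the text reports as possible on the
# first argument (A, S, or first NP) of each clause type.
# None stands for "morphologically unmarked" (an assumption of this sketch).
LICENSED_MARKERS = {
    "transitive": {"=ye", "=no"},          # A must be marked
    "intransitive": {"=ye", "=no", None},  # S may also be unmarked
    "non-verbal": {"=ye", "=no"},          # first NP must be marked
}


def is_licensed(clause_type, marker):
    """Return True if `marker` is reported as possible for `clause_type`."""
    return marker in LICENSED_MARKERS[clause_type]


print(is_licensed("transitive", None))    # unmarked A, cf. the starred (2)
print(is_licensed("intransitive", None))  # unmarked S, cf. (6)
```

The table records only whether a marker is possible in a clause type; which of the possible markers a speaker actually chooses depends on the semantic and pragmatic factors examined in the next section.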


3 Triggers of differential A and S marking 


In this section, I describe some of the triggers of differential A and S marking in Sumi. 
The analysis presented here is a summary of the one presented in Teo (2012). Generally, 
in transitive clauses, situational characteristics of arguments like control and volition 
largely determine the choice of =no or =ye. In intransitive and non-verbal clause types, it 
seems that discourse characteristics like topicality, contrastiveness, focus, and perhaps 
unexpectedness are the main triggers. However, there are cases where such discourse 
characteristics appear to also influence differential A marking in transitive clauses, while 
certain situational characteristics of arguments may also be relevant for differential S
marking.


3.1 Transitive clauses


As mentioned in the previous section, A arguments in clauses with two or more core
arguments must take either =no or =ye. The use of =no in such clauses is often associated
with an agent that has a high degree of volition, control and purpose. For instance, in
(9)-(11), =no marks a volitional and purposeful A that is able to effect a change in the world.
Note that in (10), the river is regarded as a supernatural force that has been actively
preventing a mother from making a crossing with her baby, and eventually sweeps them
away when they attempt to cross. In contrast, =ye often marks experiencers, which are
characterized by having a low degree of volition and control over an action, as in (12)
and (13).⁷ These features of the A argument (volition, control and purpose) are in line
with some of the components proposed by Hopper & Thompson (1980) in their analysis
of semantic transitivity.


(9) Sumi (Tibeto-Burman; elicited)
I=no  a-lhache  he-qhi-ve.
1SG=no  NRL-ant  hit-kill-VM


‘I killed an ant.’


(10) Sumi (Tibeto-Burman; Kutili_bird_short, line 23) 
a-zü=no  küma  yipesü-u-ve.
NRL-water=no  3DU  sweep-go-VM


‘... the water swept them both away.’


(11) Sumi (Tibeto-Burman; elicited)
Ni-nga=no  kuu  shi-va  kea?
1PL-child=no  what  do-PRF  Q


‘What has our daughter done?’


(12) Sumi (Tibeto-Burman; Kutili_bird_short, line 27)
Ni=ye  ni-nga=sütsa  chu-mla-va-i.
1SG=ye  1PL-daughter=voice  hear-NCAP-PRF-EMPH


‘I no longer hear any news from our daughter.’


(13) Sumi (Tibeto-Burman; Kutili_bird_short, line 26)
Ni-nga=ye  kuu  shi-va  kea?
1PL-child=ye  what  do-PRF  Q


‘What has happened to our daughter?’


Certainly, in most of these examples, the degree of agentivity of the A is closely linked
to the lexical verb: =no is preferred on the A argument with canonical transitive verbs
like ‘kill’, as in (9), where A has a higher degree of agentivity, while =ye is preferred on A
with verbs of passive perception like ‘see’ or ‘hear’, as in (12), where A has a low degree
of agentivity.


⁷ In (13), the argument marked by =ye would not be considered to be an A argument, but rather an
experiencer/locative subject.

With some verbal predicates, the choice of =no or =ye on A corresponds to a specific
sense of the verb. For example, (14), where A is marked with =ye, describes a scene where
the referent is not in control of the action. One could interpret pele as ‘to spill’ or ‘to
bleed’. In contrast, in (15), where A is marked with =no, the verb pele has more of a
causative interpretation: ‘cause to spill’.


(14) Sumi (Tibeto-Burman; elicited)
Pa=ye  a-ji  pele-ve.
3SG=ye  NRL-blood  spill-VM


‘He was bleeding.’


(15) Sumi (Tibeto-Burman; elicited)
Pa=no  a-ji  pele-ve.
3SG=no  NRL-blood  spill-VM


‘He threw away blood.’


With some verbal predicates, as in (16), =no is the expected marker on A if one assumes 
that the chief should have authority among his people. In comparison, in one possible 
interpretation of (17), the use of =ye suggests that A is a less effective agent, i.e. a chief 
who cannot make his people obey him even though he gave an explicit command. Note 
that animacy and definiteness do not appear to affect the choice of =no or =ye in these 
examples. 


(16) Sumi (Tibeto-Burman; elicited)
A-kü-ka-u=no  a-zah  tsü-ve.
NRL-NZP-rule-DEF=no  NRL-command  give-VM


‘The chief gave a command.’


(17) Sumi (Tibeto-Burman; elicited)
A-kü-ka-u=ye  a-zah  tsü-ve.
NRL-NZP-rule-DEF=ye  NRL-command  give-VM
‘The chief gave a command.’ (One interpretation has a sarcastic reading and
implies that no one obeyed him.)


Transitive clauses show a split-A system, where =no typically correlates with higher
agentivity and =ye with lower agentivity, such as experiencers with a low degree of
volition and control.⁸ However, the agentivity of the A referent cannot always explain
the distribution of the morphemes =no and =ye. It is important to note that the sentence
in (17) could also be interpreted without sarcasm as ‘(Someone else did something), as
for the chief, (he) gave a command.’ This, as well as evidence from intransitive clauses
(see next section), suggests that =ye can also function as a kind of topic marker in some
transitive clauses.


⁸ A prototypical “experiencer”, as per Payne (1997: 50), is “an entity that receives a sensory impression, or in
some other way is the locus of some event or activity that involves neither volition nor a change of state”.

In narratives, it is not always easy to tease apart the various functions of =ye. For
example, in (18),⁹ =no occurs in the first clause, which describes how two sisters made
axone, a popular Sumi dish of fermented soya beans, for the very first time. In contrast, 
=ye is found in the second clause, which describes how Sumis then habitually cooked 
the dish. Although the use of =no and =ye does not appear to be motivated by situational 
characteristics relating to volition and control of the participants, one might still argue 
that according to Hopper & Thompson's (1980) criteria, the first clause displays a higher 
degree of transitivity than the second, since the former refers to the first (telic and punc- 
tual) instance of an event, while the latter refers to a repeated event that is atelic and 
non-punctual. On the other hand, in an alternative analysis that assigns greater impor- 
tance to discourse factors, =no highlights that this was a newsworthy event, and that it 
was this pair of sisters, not anyone else, who instigated the first instance of the event; 
while =ye is used in the second clause to set up a change in A argument from the two 
sisters to Sumis in general. 


(18) Sumi (Tibeto-Burman; Origin_of_axone, lines 17-20)
Tishi-no  [küma=no  a-xone  lho-chu-phe-püzü-no]
like.that-no  [3DU=no  NRL-ferm.soya.beans  cook-eat-start-CONJ-CONN]
tingu-no  a-la-u=ye  [Sümi=qo=ye  a-xone
because.of.that-no  NRL-path-DEF=ye  [Sumi=PL=ye  NRL-ferm.soya.beans
lho-chu-u-ve].
cook-eat-go-VM]

‘Henceforth, the two (sisters) started to cook and eat axone (a fermented soya
bean dish) and consequently from then on, the Sumis have cooked and eaten
axone.’


In some cases, it may be difficult to tell if =no is marking an agent or some kind of 
contrast. For example, in (19), the A argument has volition and control, which may ex- 
plain the appearance of =no. However, it is also possible that the use of =no is associated 
with counter-expectation, i.e. the event that is instigated by A is not expected given the 
known circumstances, if one assumes that having children gives a husband less reason 
to abuse his wife. 


(19) Sumi (Tibeto-Burman; Kutili_bird_short, lines 6-7)
a-tianu  a-u-ve=mu  [a-kimi=no  li=sapüsa]
NRL-children  EXIST-go-VM=NEG  [NRL-husband=no  3SG.F=mistreat]


‘... despite having children, the husband mistreated her.’


In general, the degree of agentivity of A seems to be the more important factor in the
choice of =no or =ye. A corpus study is currently under way to investigate the extent to
which the choice of =no or =ye is determined by the number of core arguments licensed
by a verb, the semantic roles assigned by a verb, and the animacy of A.


⁹ It should be noted that =no and =ye can also occur on adverbial adjuncts. This will be discussed further
in §4.4.


3.2 Intransitive clauses 


In intransitive clauses, the first time an argument is mentioned in discourse, it can be 
morphologically unmarked, as in (20). However, if an S is being contrasted with another 
S, it takes =ye, which marks it as a contrastive topic, i.e. ‘as for this S, S did something’,
as in (21).


(20) Sumi (Tibeto-Burman; Telephone_conversation01, line 4)
Kivi  zü  a-ni.
Kivi  sleep  PROG-NPST


‘Kivi is sleeping.’ (1st mention)


(21) Sumi (Tibeto-Burman; Telephone_conversation01, line 7)
O  Kivi=ye  zü  a-phi.
EXCL  Kivi=ye  sleep  PROG-CONT


‘Oh, Kivi is still sleeping.’ (2nd mention) (Kivi was previously mentioned, but the
speaker then switched to talking about her other son, before switching back to
talking about Kivi)


S is unmarked after having been introduced in a previous presentational clause, as 
in (22), which follows the opening line: ‘Once upon a time, there were two sisters’. Here, the S
argument küma is not marked with =ye because the two sisters are not being contrasted
with anyone else in the story. 


(22) Sumi (Tibeto-Burman; Origin_of_axone, line 3) 
Küma  a-lu=lo  hu-niye=ke=lo
3DU  NRL-field=LOC  go.field-PROS=NZR=LOC


‘While the two were about to go to the field ...’


Importantly, S is always marked with =ye in elicited sentences, such as (23).


(23) Sumi (Tibeto-Burman; elicited)
A-kulu=ye  ighi-va.
NRL-light=ye  come-PRF


‘The power has come back.’


This illustrates that S can be morphologically unmarked only in data collected in more
naturalistic contexts, i.e. from conversations and narratives. When working with
recorded texts, if speakers are asked to repeat sentences produced in such texts, they
will sometimes add =ye to S arguments, even in cases where =ye was not found with
S in the original text. This suggests that the use of =ye in intransitive clauses is associated
more with some discourse-pragmatic function, such as continuing topic, than with the
marking of the semantic role of experiencer, as was described for transitive clauses.

In addition, S arguments can be marked by =no. The use of =no here, rather than
marking the semantic role of agent, typically marks some kind of focus on the argument.
For example, in (24), =no is used when S is the answer to a question. It can also be used
to highlight contrastive focus, i.e. ‘this S, not any other one’, as well as corrective focus,
i.e. ‘this S, not the one you think it is’.


(24) Sumi (Tibeto-Burman; elicited)
Pa=no  nu-va.
3SG=no  laugh-PRF


‘He laughed (not anyone else).’ (answers the question: ‘Who laughed?’)


In some situations, S is marked with =no, with no obvious contrastive focus reading. 
An example is given in (25), which describes God's descent to Earth in the biblical story 
of the Tower of Babel. 


(25) Sumi (Tibeto-Burman; Sumi Bible, Genesis 11:5)
A-mpeu=no  iqi-e.
NRL-lord=no  descend-EMPH


‘... the Lord came down.’


The ongoing corpus study will also look at how frequently =no occurs with S and what 
factors best account for its occurrence in intransitive clauses, since it is unclear whether 
=no is used in example (25) because: (a) it signals a high degree of volition, control
and purpose associated with the referent, i.e. an omnipotent being; or (b) it marks some 
degree of surprise or counter-expectation for the action performed by S; or (c) it is a 
combination of these two and other factors. 


3.3 Non-verbal clauses 


Non-verbal clauses are also worth mentioning in a discussion of differential argument 
marking in Sumi. There is no copula verb in the affirmative present tense and in such 
clauses, the first NP is obligatorily marked by either =ye or =no. In pragmatically
unmarked statements, the subject requires =ye, cf. (26) and (27). If the first NP is marked
with =no, as in (28), a corrective or contrastive focus reading is obtained, similar to
the use of =no with S arguments in intransitive clauses. This particular example came
about when a speaker corrected the researcher for assuming that the father of a person
of mixed ancestry in the town was Sumi; in fact, it was the mother who was Sumi.


(26) Sumi (Tibeto-Burman; elicited) 
Pa-za=ye  Sümi.
3SG-mother=ye  Sumi


‘His mother is Sumi.’


(27) *Pa-za Sümi.


(28) Sumi (Tibeto-Burman; natural conversation, unrecorded) 
Pa-za=no  Sümi.
3SG-mother=no  Sumi


‘His mother is Sumi (i.e. not his father, not anyone else).’


Unlike in the previously discussed clause types, the choice between =no and =ye in
equative clauses cannot be attributed to differences in the semantic transitivity of the 
clause. Rather, it is discourse pragmatic factors that seem to condition the distribution 
of =no and =ye, with the former used to mark contrastive or corrective argument focus 
while the latter is used to mark either a new, contrastive or continuing topic. 


3.4 Summary of triggers of differential argument marking 


A summary of the functions of =no and =ye by clause type is given in Table 1. 


Table 1: Summary of functions of =no and =ye by clause type 


Clause type           =no                           =ye                            unmarked
Transitive clauses    ‘agent’: high degree of       ‘experiencer’: low degree of   [not possible]
                      control, volition,            control, volition,
                      purpose etc.                  purpose etc.
Intransitive clauses  ‘focus’: contrastive /        ‘topic’: contrastive,          first mention of
                      corrective                    continuing                     referent
Non-verbal clauses    ‘focus’: contrastive /        ‘topic’: new, contrastive,     [not possible]
                      corrective                    continuing
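
For readers who prefer to see Table 1 as data, its cells can be written out as a mapping from clause type and marker to the typical function. This is a sketch under the assumption that each cell can be treated as a single label, which simplifies a distribution that is less clear-cut in practice. The names `TABLE_1` and `typical_function` are hypothetical.

```python
# Illustrative sketch only: Table 1 written as a nested dictionary.
# Labels paraphrase the table; None marks cells given as "[not possible]".
TABLE_1 = {
    "transitive": {
        "=no": "agent: high degree of control, volition, purpose",
        "=ye": "experiencer: low degree of control, volition, purpose",
        "unmarked": None,
    },
    "intransitive": {
        "=no": "focus: contrastive / corrective",
        "=ye": "topic: contrastive, continuing",
        "unmarked": "first mention of referent",
    },
    "non-verbal": {
        "=no": "focus: contrastive / corrective",
        "=ye": "topic: new, contrastive, continuing",
        "unmarked": None,
    },
}


def typical_function(clause_type, marker):
    """Look up the typical function listed in Table 1 (None if not possible)."""
    return TABLE_1[clause_type][marker]


for clause_type in TABLE_1:
    print(clause_type, "=no ->", typical_function(clause_type, "=no"))
```

A lookup of this kind returns only the typical function of a marker; it does not capture cases where discourse factors bear on A marking or situational factors bear on S marking.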


It appears that situational characteristics of arguments like control and volition play a 
large role in differential A marking in transitive clauses, while discourse characteristics 
like topicality and contrastiveness play a large role in differential argument marking in 
intransitive and equative clauses. Nevertheless, it is important to note that this distinc- 
tion is not as clear-cut as it appears in Table 1. As previously shown, there are examples 
that suggest that discourse characteristics like focus and unexpectedness may play a role 
in determining differential A marking even in transitive clauses, while situational charac- 
teristics like volitionality and control may also determine differential S marking in some 
intransitive clauses. Crucially, it should be noted that certain features of referents like
animacy and definiteness do not seem to play a large role in differential argument marking
in Sumi. Certainly, such features interact with notions of discourse prominence and
expectedness, but any apparent indexation of these features could simply be regarded as
epiphenomenal.


4 Origins of differential A and S marking 


Having looked at some factors governing the synchronic pattern of differential argument 
marking in Sumi, let us now consider the diachronic origins of the relevant markers. 
Given that the primary functions of =no and =ye differ by clause type, and that different
clause types differ in terms of the obligatoriness of argument marking, it would be
prudent to consider the origin of the =ye and =no markers in each clause type separately. 


4.1 Origins of =ye in transitive clauses 


It was shown earlier that experiencers in transitive clauses are typically marked by =ye, 
as in (29). 


(29) Sumi (Tibeto-Burman; Kutili_bird_short, line 27)
Ni=ye  ni-nga=sütsa  chu-mla-va-i.
1SG=ye  1PL-daughter=voice  hear-NCAP-PRF-EMPH


‘I no longer hear any news from our daughter.’


There is some language-internal evidence that points to a locative as the source of 
this marker, even though the synchronic locative marker in Sumi is lo. In predicate pos- 
session clauses, such as (30), the possessor is marked with =ye. The possessor is then 
followed by the possessee and an existential verb ani or ache. The structure of such pred- 
icate possession clauses parallels that of existential clauses, as in (31), where the location 
aghuloki lakhi lo is marked with the synchronic locative lo, followed by the entity in 
question and an existential verb. 


(30) Sumi (Tibeto-Burman; elicited) 
Ni=ye  a-tsü  a-ni.
1SG=ye  NRL-dog  EXIST-NPST


‘I have a dog.’


(31) Sumi (Tibeto-Burman; Origin_of_axone, line 2)
Khaghi  a-ghuloki  lakhi=lo  a-tsünipu  kini  a-che=ke=ti ...
long.ago  NRL-time.period  one=LOC  NRL-sister  two  EXIST-PST=NZR=MED


‘Once upon a time, there were two sisters ...’


Given that Sumi does not appear to have a separate verb meaning ‘to possess’, but 
rather the same existential verb root a- in both clause types, this suggests that the =ye 
that is found on the possessor was once used to indicate location. The use of a locative
subject in possessive constructions is common in Tibeto-Burman, but is also found in
other languages of the world (see Clark 1978; Stassen 2013).

Similarly, in constructions that express ‘to like’, as in (32), the liker is typically marked
with =ye. What looks like a verb meaning ‘to like’, alo, has the internal structure of a noun
meaning ‘goodness’ or ‘good’, and has the same nominal prefix a- found in the citation
form of most nouns in Sumi. This would suggest that the origin of this construction is
possibly a locative construction that may be translated literally as ‘At you, axone is
(usually) good.’ The verb cheni marks the existence of a habitual state and in some contexts
can be used interchangeably with the existential verb ani.


(32) Sumi (Tibeto-Burman; natural conversation, unrecorded) 
No=ye  a-xone  a-lo  che-ni  kea?
2SG=ye  NRL-fermented.soya.beans  NRL-good  HAB-NPST  Q


‘Do you like axone (fermented soya bean dish)?’


The use of locative constructions to code experiencer “subjects” is well attested in the 
languages of South Asia (see Verma & Mohanan 1990), including Tibeto-Burman lan- 
guages of the area, such as Meithei (Chelliah 1997: 108) and Tshangla (Andvik 2010: 142). 
In these languages, locative (as well as dative) case marking is also found on posses- 
sor subjects in copular clauses. The second argument in these clauses is usually in the 
absolutive case, which is typically morphologically unmarked. 

Preliminary comparative data from other Angami-Pochuri languages further suggest 
that Sumi =ye derives from an old locative marker. In Khezha, the locative marker is eh,¹⁰
as seen in (33), while in Mao, the locative marker is yi, as seen in (34). It is possible
that both these markers are cognate with Sumi =ye, although more work needs to be done
to establish their cognacy by examining regular sound correspondences between these
languages.


(33) Khezha (Tibeto-Burman; Kapfo 2005: 286) 
Mary nü ketsü eh beh a. 
Mary NOM garden LOC EXIST PART 


‘Mary is in the garden.’


(34) Mao (Tibeto-Burman; based on Giridhar 1994: 185) 
Athikho  Lokho-yi  kahie.
Athikho  Lokho-LOC  be.close¹¹


‘Athikho is close (in spatial distance) to Lokho.’


Given the above evidence, it would therefore be reasonable to hypothesize that an old
Angami-Pochuri locative is the origin of Sumi =ye, at least in transitive clauses.

However, it should also be noted that in Khezha and Mao, O arguments appear to
optionally take locative markers, i.e. there is a contrast between an overt marker and a
lack of marking, though the triggers for such differential marking are not well described.
Examples where O arguments are overtly marked are given in (35)-(37). It is unclear
if these markers really do mark semantic patients / grammatical objects vs. semantic
locations, since they usually occur with contact verbs, e.g. meke ‘to bite’, or a compound
based on a contact verb, e.g. meke-thru ‘to kill by biting’. However, in Mao at least, the
locative with O is also used with the verb ‘to love’, as in (37), suggesting it has started to
mark O arguments more generally.


¹⁰ The grave accent marks low tone in Khezha.
¹¹ Giridhar (1994) does not provide morpheme-by-morpheme glosses for his examples. All glosses for
examples from Mao have been added based on his grammatical description and examples given in the grammar.


(35) Khezha (Tibeto-Burman; Kapfo 2005: 288) 
Cotsü  nü  coha  eh  meke-thru  dah.
black.ant  NOM  red.ant  LOC  bite-kill  PART


‘A black ant has killed a red ant.’


(36) Mao (Tibeto-Burman; based on Giridhar 1994: 180) 
Nili-no  Nisa-yi  da  pie.
Nili-ERG  Nisa-LOC  beat  give


‘Nili beats Nisa.’


(37) Mao (Tibeto-Burman; based on Giridhar 1994: 184)
Ai  Athia-yi  le shüe.
1SG.NOM  Athia-LOC  love
‘I love Athia.’


While it is still uncertain what the exact triggers for such differential O marking in 
Khezha and Mao are, for the purposes of this paper, it is simply important to note the 
shift from a locative to what is starting to look like a patientive marker. Similar patterns 
have been noted in other Tibeto-Burman languages of South Asia, including Tshangla 
(Andvik 2010: 156), where the locative / dative ga may occur on an experiencer or goal 
patient. 

It therefore appears that one source for =ye on A arguments in transitive clauses is 
the reanalysis of locative experiencers/patients as experiencer As. The function of =ye 
was then extended to non-agent-like As, possibly because it was then in contrast with 
the agentive marker =no. This would be a Sumi-specific innovation not found in other 
Angami-Pochuri languages where the locative optionally marks O arguments.¹²


4.2 Origins of =ye in intransitive and non-verbal clauses 


In the previous section, we saw how a locative might have developed into an experiencer
A marker. In intransitive clauses, the same locative marker might have developed into
a topic marker. However, the latter is not a widely attested grammaticalization pathway
and without sufficient language-internal and comparative evidence, I am left to speculate
on the origins of =ye in intransitive clauses.


¹² In new data collected by the author, it turns out that there are some Sumi speakers who can optionally
mark O arguments with the synchronic Sumi locative =lo. Little is known of the triggers of differential
O marking, and preliminary data and speaker judgements show much variation across the community:
some speakers reject any marking on O arguments; some accept O marking only with verbs of contact;
and others accept optional marking on O arguments in general.

One possible clue to the origins of =ye marking on S arguments may come from
non-verbal clauses. As previously shown, the first NP in such clauses obligatorily takes =ye or
=no. Synchronically, there is no copular verb in such clauses in the present affirmative.¹³

In contrast, in the related language Mao, Giridhar (1994) gives examples of equative 
clauses (which he calls "predicate phrases") where -ko-e is added to the second argument 
in the clause, as in (38). What is represented as the suffix -ko is identical in form to 
a verbal nominalizing prefix in the language. This would suggest that -e has a verbal 
origin — more specifically, a copular verb. 


(38) Mao (Tibeto-Burman; based on Giridhar 1994: 456) 
hihi a zhu-ko-e 
PRX 1SG name-ko-e 
‘This is my name.’


This may lead one to wonder if Sumi also once used a copular verb in equative clauses.¹⁴
The pathway from copula to topic marker is not common, but it is attested. Harris & 
Campbell (1995: 165-166) give examples of copulas being reanalyzed as topic markers, in 
what they term “anti-cleft” constructions. 

Alternatively, it is not uncommon for equative copulas to develop into focus markers, 
typically through cleft constructions (Heine & Kuteva 2002: 95). One could speculate that 
an old Sumi equative copula was reanalyzed as a focus marker via a cleft construction, 
which was then extended to mark new and continuing topics. This pathway is attested:
for instance, Ueno (1987) uses historical textual data to show that the Japanese topic
marker wa originated as a contrastive marker ha used for “emphasis” before it developed 
the function of marking topic differentiation, and eventually topic continuity. 

In the case of Sumi, good historical data is not available and what has been presented 
here is still speculation. Furthermore, it is still unclear how =ye would have spread from
equative clauses to intransitive ones. Perhaps, if the time depth for such grammaticalization
processes in Sumi is shallow, it might even be helpful to look at differences in
the distribution of =ye in the speech of older vs. younger speakers or between villages
which are said to speak more ‘conservative’ varieties of Sumi vs. other villages.


¹³ However, in other tenses, Sumi does use copulas derived from the verb shi ‘to do’.
¹⁴ One also wonders if the Sumi post-verbal emphatic suffix -e ~ -i, as seen below, is a reflex of an older copula.


(i) Sumi (Tibeto-Burman; Kutili_bird_short, line 33)
Pa=ye  khaghi=no  o=pütsa-ni  pi  u-va-e.
3SG=ye  long.time=no  2SG=talk.to-PROS  say  go-PRF-EMPH


‘She said a long time ago that she was going to see you and left.’


4.3 Origin of =no in transitive and intransitive clauses


There is evidence pointing to an instrumental origin for the agentive =no, which then
was extended to mark contrastive focus. However, positing a Sumi-specific origin for
=no in transitive and intransitive clauses is somewhat problematic. The instrumental
marker in Sumi is pesü, derived from a verb meaning ‘take’, but there is evidence of a
rarer instrumental =no that is homophonous with the agentive =no, as in (39). This rarer
=no is likely an older instrumental marker that is being replaced by a more recently
innovated and morphologically transparent pesü.


(39) Sumi (Tibeto-Burman; elicited)
Pa-puh=no  a-ngu=no  a-chequ  qhi-ve.
3SG-father=no  NRL-spear=no  NRL-porcupine  pierce-VM


‘His father impaled the porcupine with the spear.’


Syncretism between the agentive and instrumental (and sometimes the ablative) is
widespread across languages (Garrett 1990) and found throughout Tibeto-Burman
(DeLancey 1984; LaPolla 1995; Noonan 2009). In the family, one finds numerous morphemes
with the form nV (where V is a vowel) that have been glossed as ‘ergative’, ‘instrumental’
or ‘ablative’. Consequently, this makes it difficult to determine whether the agentive
function of Sumi =no is inherited from an earlier proto-language, or if it is an example
of parallel grammaticalization across languages of the family, as per LaPolla (1995).

In terms of directionality, the development of ergative / agentive markers from
instrumental markers is well attested (e.g. Garrett 1990).¹⁵ However, Coupe (2011) questions
this particular pathway for the Ao languages (Tibeto-Burman), which often display
syncretism between the agentive, instrumental and allative. Rather, he posits a proto-Ao *na
which was a “semantically underspecified marker of location”, with pragmatic
context determining the “precise” semantic role it marked, such as agent, instrument,
goal, source etc.¹⁶

In addition, in many Tibeto-Burman languages, the agentive / ergative, like Sumi =no, 
does not simply mark agentivity, but has been extended to other functions, including 
discourse pragmatic functions like contrastiveness and unexpectedness. For example, in 
Lhasa Tibetan, the ergative marker -s/-gis on an argument in certain monovalent clauses 
can give a contrastive focus reading, i.e. 'this S, not someone else', when accompanied by 
the “proper intonation” (Tournadre 1991). In Mongsen Ao, the agentive na can be used to 
indicate willfulness, in addition to intentionality (Coupe 2007: 157). In terms of directionality,
it has been demonstrated that in some languages the discourse-pragmatic morphemes
have developed from the semantic role markers (e.g. Chelliah 2009 for Meithei), following
the expected path from more concrete to more abstract meaning (Heine & Kuteva
2002, inter alia).


¹⁵ Note that Garrett (1990) does not rule out the possibility that some instrumental markers may reflect older
ergatives.
¹⁶ Coupe (2011) also shows that most synchronic ablatives in Ao languages are compounds of the locative +
agentive / instrumental, and suggests that the original ablative in these languages was syncretic with the
agentive and instrumental, as well as the allative.

However, once again given the presence of numerous potential nV cognates across
the family, it is difficult to use cross-linguistic data to determine the extent to which the
functions of =no in Sumi as both an agent marker and a focus marker are something that
was inherited from an ancestor language, or are an example of parallel drift within the
Tibeto-Burman family. It would perhaps be useful to look beyond the marking of A and
S and examine morphological marking in other parts of the grammar.


4.4 Morphological marking of adverbial adjuncts 


To further understand the historical development of =ye and =no, one area for further
research is the marking of adverbial adjuncts in Sumi. Like S arguments, these adjuncts 
show a three-way opposition in morphological marking. In (40)-(42), there are examples 
of adjuncts marked by =ye, =no or by neither enclitic, respectively. These examples are
important to consider, since the markers appear to have discourse-pragmatic functions,
e.g. contrastive focus, similar to those described for S argument markers.


(40) Sumi (Tibeto-Burman; Kutili_bird_short, line 8)
Ishi=ke=hu  pa=ye  ghulo  lakhi=ye, "Pr P
like.this=NZR=DIST  3SG=ye  day  one=ye  EXCL


‘So one day she thought to herself, “Oh ...”’


(41) Sumi (Tibeto-Burman; Kutili_bird_short, line 33) 
Pa=ye  khaghi=no  o=pütsa-ni  pi  u-va-e.
3SG=ye  long.time=no  2SG=talk.to-PROS  say  go-PRF-EMPH


‘She said a long time ago that she was going to see you and left.’


(42) Sumi (Tibeto-Burman; Origin_of_axone, line 7)
A-tsala  a-küthü-ni-u  a-lu=lo  ilesü  hu-ghi=no
NRL-day  NRL-three-ORD-DEF  NRL-field=LOC  return  go.field-come=CONN
‘On the third day, they returned to the field.’


The question here is: did such marking on adjuncts arise prior to, parallel to, or even 
after differential A and S marking? For example, one might posit a locative function and 
origin for =ye in (40), but it cannot be assumed that its development followed the same 
diachronic pathway as =ye in pa=ye, in the same example. One also cannot easily posit 
an origin for =no in (41). 

If the development of differential A and S marking has been driven to some extent 
by information structure, it is important to understand how pragmatic discourse factors 
have influenced other aspects of the grammar, including cleft / cleft-like constructions 
and the marking of relative clauses. Such work would benefit from the use of experi- 
mental methods typically used to study the role of prosody in information structure, in- 
cluding questionnaires and other tasks designed to elicit semi-spontaneous speech (e.g. 
Skopeteas et al. 2006; Hellmuth et al. 2007). 

5 Summary and further questions


In this paper, I first looked at the distribution of A and S marking in Sumi, and showed 
that Sumi has a two-way contrast for A and a three-way contrast for S, but no morpho- 
logical marking of non-pronominal O. This is markedly different from closely related 
languages such as Mao and Khezha that show a two-way opposition for O, in addition 
to a two-way opposition on S and A arguments. 

Next, I examined some of the triggers of differential A and S marking in Sumi. It was 
shown that in transitive clauses, differential A marking is determined largely by the 
agentivity of the A argument, i.e. the degree of volition, control and purpose of the A ar- 
gument. In intransitive clauses, it was shown that differential S marking was determined 
mainly by discourse pragmatic functions such as continuing reference, contrastive focus, 
and the marking of unexpectedness. Furthermore, some of these functions seem to influ- 
ence differential A marking even in transitive clauses, although the extent to which this 
is the case remains a topic for further investigation. 

I then considered the origins of such differential markers in Sumi. It was hypothe- 
sized that =ye in transitive clauses developed from an old locative marker. It was further 
speculated that =ye in intransitive and equative clauses may have developed from an 
old copula. No clear Sumi-specific origin could be presented for the agentive / focus
marker =no, given that cognates of =no are found throughout the Tibeto-Burman family;
these typically function as agentives or ergatives, but also as instrumentals and ablatives,
and can have discourse pragmatic functions like marking contrastive focus.

There are still many questions to be answered regarding the distribution of =ye and
=no in Sumi, as well as their diachronic origins. Future research will also need to look
at the morphological marking of adjuncts and relative clauses. Such work would benefit 
from corpus studies based on naturalistic data, as well as the use of experimental tasks 
designed to elicit and identify information structure categories. 

To account for the same form =ye used in transitive, intransitive and equative clauses, one might have
to further speculate that the old locative marker and equative copula both derive from an older locative 
copula. 

Abbreviations

1      1st person            NEG    negative
2      2nd person            NOM    nominative
3      3rd person            NPST   non-past tense
CONJ   conjunction           NRL    non-relational (unpossessed noun)
CONN   connective            NZP    nominalizing prefix
CONT   continuative aspect   NZR    nominalizing enclitic
DEF    definite              ORD    ordinal number
DIST   distal                PART   particle
DU     dual                  PL     plural
EMPH   emphatic              PRF    perfect aspect
ERG    ergative              PRX    proximal
EXCL   exclamation           PST    past tense
EXIST  existential verb      PROG   progressive aspect
HAB    habitual aspect       PROS   prospective aspect
LOC    locative              Q      question particle
MED    medial                QUOT   quotative
NCAP   non-capability        SG     singular
NCPL   non-completive        VM     verbal marker

References 


Aissen, Judith. 2003. Differential object marking: Iconicity vs. Economy. Natural
Language and Linguistic Theory 21(3). 435-483.

Andvik, Erik. 2010. A grammar of Tshangla. Leiden: Brill. 

Bossong, Georg. 1983. Animacy and markedness in universal grammar. Glossologia 2(3). 
7-20. 

Bossong, Georg. 1985. Differentielle Objektmarkierung in den neuiranischen Sprachen.
Tübingen: Narr.

Burling, Robbins. 2003. The Tibeto-Burman languages of Northeastern India. In Graham
Thurgood & Randy LaPolla (eds.), The Sino-Tibetan languages, 169-191. London:
Routledge.

Chelliah, Shobhana L. 1997. A grammar of Meithei. Berlin: Mouton de Gruyter. 

Chelliah, Shobhana L. 2009. Semantic role to new information in Meithei. In Jóhanna
Barðdal & Shobhana L. Chelliah (eds.), The role of semantic, pragmatic, and discourse
factors in the development of case, 377-400. Amsterdam: John Benjamins.

Chelliah, Shobhana L. & Gwendolyn Hyslop. 2012. Introduction to special issue on op- 
tional case marking in Tibeto-Burman. Linguistics of the Tibeto-Burman Area 34(2). 1- 
7. 

Clark, Eve V. 1978. Locationals: Existential, locative, and possessive constructions. In 
Joseph H. Greenberg, Charles A. Ferguson & Edith A. Moravcsik (eds.), Universals of 
human language, vol. 4: Syntax, 85-126. Stanford: Stanford University Press. 




Coupe, Alexander R. 2007. A grammar of Mongsen Ao. Berlin: Mouton de Gruyter. 

Coupe, Alexander R. 2011. Pragmatic foundations of transitivity in Ao. Studies in Lan- 
guage 35(3). 492-522. 

Dalrymple, Mary & Irina Nikolaeva. 2011. Objects and information structure. Cambridge: 
Cambridge University Press. 

de Hoop, Helen & Peter de Swart (eds.). 2008. Differential subject marking. Dordrecht: 
Springer. 

DeLancey, Scott. 1984. Etymological notes on Tibeto-Burman case particles. Linguistics 
of the Tibeto-Burman Area 8(1). 59-77. 

DeLancey, Scott. 2011. Optional 'ergativity' in Tibeto-Burman languages. Linguistics of 
the Tibeto-Burman Area 34(2). 9-20. 

Dixon, R. M. W. 1994. Ergativity. Cambridge: Cambridge University Press. 

Fauconnier, Stefanie. 2011. Differential agent marking and animacy. Lingua 121(3). 533- 
547. 

Garrett, Andrew. 1990. The origin of NP split ergativity. Language 66(2). 261-296. 

Giridhar, Puttushetra Puttuswamy. 1994. Mao Naga grammar. Mysore: Central Institute 
of Indian Languages. 

Harris, Alice C. & Lyle Campbell. 1995. Historical syntax in cross-linguistic perspective. 
Cambridge: Cambridge University Press. 

Heine, Bernd & Tania Kuteva. 2002. World lexicon of grammaticalization. Cambridge: 
Cambridge University Press. 

Hellmuth, Sam, Frank Kügler & Ruth Singer. 2007. Intonational patterns, tonal
alignment and focus in Mawng. In Proceedings of 16th ICPhS satellite workshop
Intonational phonology: Understudied or fieldwork languages, Saarbruecken, Germany,
August 5th, 2007. Institute of Phonetics. http://www.linguistics.ucla.edu/people/jun/
Workshop2007ICPhS/Papers/Sam-mawng1.0.pdf.

Hopper, Paul J. & Sandra A. Thompson. 1980. Transitivity in grammar and discourse. 
Language 56(2). 251-299. 

Iemmolo, Giorgio & Gerson Klumpp. 2014. Introduction to the special issue “Differential
Object Marking: Theoretical and empirical issues”. Linguistics 52(2). 271-279.

Kapfo, Kedutso. 2005. The ethnology of the Khezhas and the Khezha grammar. Mysore:
Central Institute of Indian Languages.

LaPolla, Randy J. 1995. ‘Ergative’ marking in Tibeto-Burman. In Yoshio Nishi, James Ma- 
tisoff & Yasuhiko Nagano (eds.), New horizons in Tibeto-Burman morphosyntax, 189- 
228. Osaka: National Museum of Ethnology. 

Lewis, Paul M., Gary F. Simons & Charles D. Fennig. 2013. Ethnologue: Languages of the 
world. 17th edn. Dallas: SIL International. http://www.ethnologue.com. 

Malchukov, Andrej L. 2008. Animacy and asymmetries in differential case marking. Lin- 
gua 118(2). 203-221. 

McGregor, William B. 2010. Optional ergative case marking systems in a typological- 
semiotic perspective. Lingua 120(7). 1610-1636. 

McGregor, William B. & Jean-Christophe Verstraete. 2010. Optional ergative marking 
and its implications for linguistic theory. Lingua 120(7). 1607-1609. 




Noonan, Michael. 2009. Patterns of development, patterns of syncretism of relational
morphology in the Bodic languages. In Jóhanna Barðdal & Shobhana L. Chelliah (eds.),
The role of semantic, pragmatic, and discourse factors in the development of case,
261-282. Amsterdam: John Benjamins.

Payne, Thomas E. 1997. Describing Morphosyntax: A guide for field linguists. Cambridge: 
Cambridge University Press. 

Silverstein, Michael. 1976. Hierarchy of features and ergativity. In R. M. W. Dixon (ed.), 
Grammatical categories in Australian languages, 112-171. Atlantic Highlands, NJ: Hu- 
manities Press. 

Skopeteas, Stavros, Ines Fiedler, Samantha Hellmuth, Anne Schwarz, Ruben Stoel,
Gisbert Fanselow, Caroline Féry & Manfred Krifka. 2006. Questionnaire on information
structure (QUIS): Reference manual (Interdisciplinary Studies on Information Structure
4). Potsdam: Universitätsverlag Potsdam.

Stassen, Leon. 2013. Predicative possession. In Matthew S. Dryer & Martin Haspelmath 
(eds.), The world atlas of language structures online. Leipzig: Max Planck Institute for 
Evolutionary Anthropology. http://wals.info/chapter/117. 

Teo, Amos. 2012. Sumi agentive and topic markers: No and ye. Linguistics of the
Tibeto-Burman Area 35(1). 49-74.

Tournadre, Nicolas. 1991. The rhetorical use of the Tibetan ergative. Linguistics of the
Tibeto-Burman Area 14(1). 93-108.

Ueno, Noriko Fujii. 1987. Functions of the theme marker wa from synchronic and di- 
achronic perspectives. In John Hinds, Shoichi Iwasaki & Senko K. Maynard (eds.), Per- 
spectives on topicalization: The case of Japanese WA, 221-263. Amsterdam: John Ben- 
jamins. 

Verma, Manindra K. & Karavannur Puthanvettil Mohanan. 1990. Experiencer subjects in 
South Asian languages. Stanford: Stanford Linguistic Association. 




Chapter 14 


Differential subject marking and its 
demise in the history of Japanese 


Yuko Yanagida 


University of Tsukuba 


The subject of various types of subordinate or nominalized clauses in Old Japanese
(700-800) is marked in one of three different ways: with the postpositional particle ga, no or
zero. This paper argues that the opposition between case-marked and unmarked subjects fits
into cross-linguistically well-attested patterns of differential subject marking (DSM).
Following Woolford (2008), it shows that the syntactic and semantic characteristics of these
case marking patterns reveal that OJ displays two kinds of DSM effects, which are associated
with distinct grammatical levels. This paper also examines three possible scenarios for the loss
of DSM, which occurred in Early Middle Japanese (EMJ; 800-1200). The OJ and EMJ data
suggest that case systems do not simply shift from one alignment pattern to another, as is
sometimes assumed (cf. Harris & Campbell 1995: 258). Instead, the morphological features
of individual case markers change incrementally over time, ultimately giving rise to global
changes in the overall system.


1 Introduction 


Modern Japanese (ModJ) displays a straightforward nominative-accusative system.
Transitivity does not affect the case marking on the subject (1).


(1) Modern Japanese

a. Taroo ga sake o non-da koto (transitive)
   Taroo NOM sake ACC drink-PST that
   ‘that Taroo drank sake’

b. sakura ga sai-ta koto (intransitive)
   cherry.blossom NOM bloom-PST that
   ‘that cherry blossoms bloomed’


In ModJ the case markers ga and o mark the subject and object respectively as gram- 
matical case markers; these particles display no semantic effects. 


Yuko Yanagida. Differential subject marking and its demise in the history of
Japanese. In Ilja A. Serzant & Alena Witzlack-Makarevich (eds.), Diachrony
of differential argument marking, 363-382. Berlin: Language Science Press.

In Old Japanese (OJ; 8th century), ga is a genitive case marker. Ga marks the possessor
of noun phrases (2) and the subject of various types of subordinate or nominalized
clauses (3). Personal pronouns and human nouns intimate to the speaker, such as seko ‘lover’
and kimi ‘lord’, are obligatorily marked by ga, while non-human animate and inanimate
NPs are predominantly marked by the other genitive no or by zero.¹


(2) Old Japanese (MYS 4303; MYS 4191)

a. [wa ga sekwo ga yadwo]
   I GEN lover GEN house
   ‘my lover's house’

b. [ayu no si ga pata]
   sweetfish GEN it GEN fin
   ‘sweetfish's fins’


(3) Old Japanese (MYS 2926; MYS 3837; MYS 925)

a. [wa ga sekwo ga motomu-ru] omo ni ika-masi mono wo
   I GEN lord AGT ask-ADN nurse DAT go-AUX thing EXCL
   ‘I would go as the wet nurse that my lord asks for.’

b. [mizu no tama ni nita-ru] mimu
   water GEN pearl DAT resemble-ADN see
   ‘(I) see water resembles a pearl.’

c. [pisaki Ø opu-ru] kiyoki kapara-ni
   catalpa grow-ADN clear riverbank-on
   ‘on the banks of the clear river where catalpas grow’


A number of researchers argue that the adnominal verb ending -ru (with a different set
of endings on adjectives and auxiliaries), as in (3a-3c), had nominalizing functions (see
Miyagawa 1989; Yanagida & Whitman 2009; Robbeets 2015).² The subject of a nominalized
verb is marked in one of three ways. The semantic difference between ga and no
has been treated in the literature (cf. Ohno 1977; Nomura 1993), but bare subjects as
in (3c) have not been integrated into this discussion; they are generally set aside as
instances of stylistic case drop. Below I show that the alternation between case-marked and
unmarked arguments in OJ fits into cross-linguistically attested patterns of differential
subject marking (DSM). Under this approach, unmarked arguments cannot be viewed as
mere stylistic case drop; rather, they have both syntactic and semantic significance.


¹OJ data in this study are taken from the Man'yōshū (MYS, compiled in the mid-8th century), the earliest written
record of OJ, comprising 4516 long (chōka) and short (tanka) poems. The data are taken from the electronic text
“Man'yōshū Search System” (Yamaguchi University, Japan) as well as the Oxford Corpus of Old Japanese
(University of Oxford). For periodization, I follow Frellesvig (2010): Old Japanese (abbreviated ‘OJ’;
approximately 700-800), Early Middle Japanese (‘EMJ’; 800-1200), Late Middle Japanese (‘LMJ’; 1200-1600), Early
Modern Japanese (‘EModJ’; 1600-1800).

²Robbeets (2015) suggests that the adnominal form -ru has undergone a grammaticalization from deverbal
noun suffix to clausal nominalizer to relativizer and, finally, to finite form. 

The paper is organized as follows. §2 briefly discusses the general approach to DSM
which I adopt: DSM is realized through the interaction of three distinct levels: (i) argument
structure, (ii) syntax and (iii) PF (morphological spell-out), as proposed by Woolford
(2008). In §3, I argue that ga and no, each functioning in opposition to the zero
form, are associated with different levels of DSM: ga is a morphological realization of active
case assigned to an external argument within the vP phase. It follows independently
motivated PF constraints relatable to Silverstein's (1976) nominal hierarchy. Genitive no
is assigned to any NPs in the CP phase, where they receive specific interpretations. §4
examines three possible scenarios for the loss of DSM, which occurred in Early Middle
Japanese (EMJ; 800-1200). I argue that the development of nominative ga results from
the reanalysis of psych transitive predicates as intransitives taking a single theme argument.
The present study suggests that the loss of DSM cannot be interpreted as a simple,
one-step shift in alignment or case marking, as such changes are sometimes presented in
work on diachronic syntax (cf. Harris & Campbell 1995). Instead, the morphological features
of individual case markers change incrementally over time, ultimately giving
rise to global changes in the overall system.


2 Differential Subject Marking (DSM) 


I assume with Woolford (2008) that DSM effects are associated with three distinct
grammatical levels. The first level of DSM is closely linked to θ-role assignment (canonically,
Agent) to subjects, and to contexts where inherent (or non-structural) Case is
assigned to external arguments. This level of DSM is identified as the argument structure
(or vP phase), which corresponds to the representational level of D-structure in the
government-binding theory of Chomsky (1981). The second level of DSM is associated
with syntax above vP. It behaves in parallel to differential object marking (DOM) in that
case alternation depends on the syntactic position of the subject: often, subject or object
arguments which move outside vP are morphologically marked (by an affix or by
triggering agreement) and assigned language-particular interpretative properties, such
as specificity, definiteness, animacy, etc. (cf. Diesing 1992, Chomsky 2001). The third
level of DSM involves post-syntactic PF constraints; this is the level at which abstract
case features are spelled out morphologically. According to Woolford (2008), DSM at
this level involves markedness, which she defines in relation to Silverstein's (1976)
nominal hierarchy. Cases at the more marked end of the hierarchy are more likely to be
morphologically marked.

In both the typological and theoretical literature, active alignment is often classified as 
a subtype of ergative (cf. Comrie 1973; 1978; Silverstein 1976; Bittner & Hale 1996). Active, 
however, differs crucially from ergative alignment in that transitivity plays no role. In 
Hindi, for example, the case marker -ne appears on the agent subject of both transitive 
(4a) and unergative intransitive verbs (4b), while the theme subject of unaccusatives (4c) 
is unmarked: 

(4) Hindi (Indo-Aryan; Mohanan 1994: 71, 107)

a. raam-ne lakdii kaatii
   Ram-ERG wood.NOM cut.PERF.F
   ‘Ram cut wood.’

b. raam-ne nahaayaa
   Ram-ERG bath.PERF
   ‘Ram bathed.’

c. raam (*-ne) giraa
   Ram (*-ERG) fall.PERF
   ‘Ram fell.’


According to Woolford (1997; 2008), DSM effects in Hindi are determined at argument 
structure. The external argument (AGENT) is θ-marked and at the same time inherently
case-assigned by v in a vP projection above VP, as represented in (5). 


(5) DSM at argument structure

            vP
           /  \
    external   v′
    argument  /  \
             v    VP
          [+Agt]


The analysis of ergative (or active) as inherent case assigned to the external argument 
in the specifier position of vP originates with Woolford (1997) and is shared by many 
researchers such as Legate (2002; 2008); Aldridge (2004; 2008) and Anand & Nevins 
(2006). I maintain that while ergative is assigned to the external argument in the specifier 
position of [+transitive] v, active is assigned to the external argument in the specifier of 
[+Agent]v (Yanagida & Whitman 2009).³


³The descriptive generalization that supports the view that ergative is an inherent case comes from the fact
that ergative subjects in some instances occur in non-finite clauses while structural nominative subjects 
cannot (cf. Legate 2002; Aldridge 2004). Derived subjects are never ergative; that is, no language promotes 
objects to ergative through operations such as raising or passive. A reviewer points out that this fact may 
have a functional explanation, but the structural consequence remains the same: ergatives are assigned 
inherent case. 

3 Two Types of DSM in OJ


3.1 DSM: ga vs. zero 


Yanagida (2007) and Yanagida & Whitman (2009) argue that in OJ, main declarative
clauses have a nominative-accusative pattern: the subjects of both transitive and
intransitive verbs are morphologically unmarked. By contrast, various types of embedded or
nominalized clauses, exemplified by the adnominal clauses in (3) and (6), show active alignment.⁴


(6) Adnominal clauses: Old Japanese (MYS 868; MYS 3443; MYS 925)

a. [Saywopimye no kwo ga pire Ø puri-si] yama
   Sayohime GEN child AGT scarf wave-PST.ADN mountain
   ‘the mountain where the child Sayohime waved her scarf’

b. [wa ga yuku] miti ni
   1P AGT go.ADN road LOC
   ‘...on the road I travel’

c. [pisakwi Ø opu-ru] kiywoki kapara
   catalpa grow-ADN clear riverbank
   ‘the banks of the clear river where catalpas grow’


As we see in (6), the subjects of intransitive verbs display two distinct patterns; the 
agent subjects of the transitive and active intransitive verbs (6a)-(6b) are marked by ga, 
but the patient subject of the inactive intransitive (6c) is morphologically unmarked in 
the same way as the transitive object in (6a). 

OJ behaves in parallel to Hindi in that morphological case appears on agent subjects, 
but theme subjects of unaccusatives are zero marked. OJ, however, differs crucially from 
Hindi in that it displays a nominal-based split. Nominal-based split ergative languages
show an ergative pattern with some NPs, and a nominative pattern with others. This
interacts with Silverstein's (1976) nominal hierarchy (7). Silverstein's nominal hierarchy, as
is well known, references the feature specification of noun phrases and makes crucial use
of featural markedness. Pronouns are specified for [person (+ego, 1P)/(+tu, 2P)],
[±number], [gender], etc. Noun phrases are specified for [+proper], [+human], [+animacy], etc.


(7) The Nominal Hierarchy (Silverstein 1976)
pronouns > proper nouns > human > animate > inanimate
1st > 2nd > 3rd person


Nominative in a nominative-accusative system and absolutive in an ergative-absolu- 
tive system are unmarked (in terms of MARKEDNESS), typically phonologically zero. The 


⁴Main declarative clauses and embedded clauses selected by cognitive/speech verbs such as ip- ‘say’ or
omop- ‘think’ appear with the verb in the shūsikei ‘conclusive form’ V-u, with a different set of endings
on adjectives and auxiliaries. In conclusive clauses, both subject and object are morphologically unmarked.
The subject is never marked by no or ga.
accusative in the one system and ergative in the other are marked. Silverstein observes
that “if the noun phrases of a language have accusative case-marking at a certain plus- 
value of a feature [Fi], and ergative case-marking for [-Fi], then noun phrases are ac- 
cusative for all features above [Fi] in the hierarchy and ergative for all feature below 
[Fi] in the hierarchy" (Silverstein 1976: 123). Dixon (1979: 86-87) interprets the hierarchy 
to "roughly indicate the overall agency potential of any given NP", and observes that a 
number of languages have split case marking exactly on this principle. 

Woolford (2008), whom I follow in the discussion below, argues that MARKEDNESS as 
expressed in Silverstein's nominal hierarchy is a PF constraint (to be exact, a constraint 
on morphological spell-out). PF is the level where “decisions are made concerning the 
overt realization of (abstract) features from syntax" (Woolford 2008: 29). On this view, 
nominals lower on the hierarchy are atypical subjects; thus they are marked ergative at 
PF, while those higher on the hierarchy are atypical objects, and thus they are marked 
accusative. Nominals that realize typical subject and object grammatical functions are 
unmarked morphologically. In other words, ergative case is assigned to all transitive 
subjects, but in nominal based split ergative languages, the more marked subjects are 
those that lie lower on the hierarchy. Accusative, on the other hand, is the mirror image 
of ergative. The more marked categories for the object are those that lie higher in the 
hierarchy. 

A split based on the nominal hierarchy is also typical of active alignment, but cru- 
cially, the nominal hierarchy applies to the argument NPs in the opposite direction as 
first suggested by Dahlstrom (1983). As Mithun (1991) points out, case markers based on 
agency are frequently restricted to nominals referring to human beings. Mithun identi- 
fies the semantic basis of the active marking of various non-accusative languages, both 
synchronically and diachronically. The active system in Batsbi (Tsova-Tush) is limited 
to first and second persons. Central Pomo has an active system in nominals referring 
to humans only. The Georgian active system is restricted to human beings. The Yuki 
system is restricted to animates. From these cross-linguistic observations, the implica- 
tion follows that active marking is exactly the opposite of the right-to-left application of 
the hierarchy proposed by Silverstein for ergative languages. The relationship between 
active marking and the nominal hierarchy is as stated in (8) (cf. Yanagida & Whitman 
2009): 


(8) The active marking hierarchy (AMH)
In active languages, if active marking applies to an NP type α, it applies to every
NP type to the left of α on the nominal hierarchy.
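
The implication in (8) can also be restated procedurally. The following Python sketch is offered only as an illustration under simplifying assumptions: the hierarchy in (7) is flattened into five NP types (the person sub-hierarchy among pronouns is ignored), and the function name `obeys_amh` as well as the sets of marked NP types are hypothetical encodings introduced here, not part of the analysis in Yanagida & Whitman (2009).

```python
# Nominal hierarchy in (7), ordered from left (highest) to right (lowest).
# Assumption: the 1st > 2nd > 3rd person sub-hierarchy is not modelled.
HIERARCHY = ["pronoun", "proper noun", "human", "animate", "inanimate"]


def obeys_amh(active_marked_types):
    """Check the implication in (8) for a set of NP types.

    Returns True if, whenever an NP type receives active marking,
    every NP type to its left on the hierarchy receives it as well,
    i.e. the marked types form an unbroken stretch starting at the
    left edge of the hierarchy.
    """
    seen_unmarked = False
    for np_type in HIERARCHY:
        if np_type in active_marked_types:
            # A marked type to the right of an unmarked one violates (8).
            if seen_unmarked:
                return False
        else:
            seen_unmarked = True
    return True


# Hypothetical encodings of marking systems (illustrative only).
humans_only = {"pronoun", "proper noun", "human"}
animates_only = {"pronoun", "proper noun", "human", "animate"}
inanimates_only = {"inanimate"}

for system in (humans_only, animates_only, inanimates_only):
    print(sorted(system), obeys_amh(system))
```

On this encoding, a system restricted to human referents and one restricted to animates both satisfy (8), whereas a hypothetical system that marked only inanimates would not.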


Assignment of active case is dependent not just on the thematic role assigned by the 
verb, but on the place of S on the nominal hierarchy. Klimov (1974; 1977) emphasizes 
this point, stressing that in active languages both the semantics of the predicate and the 
subject NP govern the distribution of active case. 

In OJ, the active marking appears when the S argument has control over the activity
and the inactive pattern appears when control is lacking. Consider (9)-(10):

(9) Old Japanese (MYS 3724; MYS 177; MYS 2991)

a. [kimi ga yuk-u] miti no nagate
   lord AGT go-ADN road GEN length
   ‘the length of the road my lord travels’

b. [wa ga naku] namita
   1P AGT cry.ADN tear
   ‘the tears that I cry’

c. [papa ga kap-u] kwo
   mother AGT breed-ADN silkworm
   ‘the silkworms bred by my mother’


(10) Old Japanese (MYS 2713; MYS 3352; MYS 4318)

a. [asuka-gapa Ø yuku] se wo paya-mi
   Asuka-river go.ADN shallows OBJ fast-CONJ
   ‘since the shallows where the Asuka River flows are fast’

b. [pototogisu Ø naku] kope
   cuckoo (AGT) cry.ADN call
   ‘the call of the cuckoo crying’

c. [aki no nwo ni tuyu Ø open pagwi] wo ta-wora-zu
   fall GEN field LOC dew cover-ADN bush.clover OBJ hand-break-not
   ‘without breaking off the dew-laden bush clover in the fall meadow’


The verbs yuku ‘go’ and naku ‘cry’ are classified as active, more specifically, unergative 
verbs, and hence the subject NPs are case assigned by v[+Agent] (see (5) above), but 
whether the subject NP is morphologically realized depends on the semantic features 
of the nominals. The use of ga is obligatory for personal pronouns such as wa ‘I’ and
kimi ‘you/lord’. The human NPs higher on the hierarchy are associated with prototypical
agents, which express volition and control, whereas the non-human or inanimate NPs 
lower on the hierarchy do not correspond to the transitivity prototype. This correlates 
with the fact that transitive subjects are marked by ga, but never marked by zero in 
embedded nominalized clauses in OJ. 

The most crucial syntactic property of transitive clauses in OJ is that wo-marked
objects necessarily move over the ga-marked subject, resulting in OSV word order (11).
When objects are unmarked, they have canonical SOV word order (12) (Yanagida 2006;
Yanagida & Whitman 2009). Wo-marked objects are specific, while zero-marked objects
are non-specific.⁵


⁵In Yanagida & Whitman (2009) and Frellesvig et al. (2015; 2018) we argue that OJ displays DOM effects
associated with specificity (cf. Aissen 2003).

(11) Old Japanese (MYS 3669; MYS 3960; MYS 3459)
[Object wo Subject ga V]:

a. ware wo yami ni ya imo ga kwopi-tutu aru ram-u?
   I OBJ dark LOC Q wife AGT long.for-CONT be AUX-ADN
   ‘Would my wife be longing for me in the dark?’

b. kimi wo aga mat-an-akuni
   lord OBJ I.AGT wait-not-NMLZ
   ‘without me waiting for you’

c. aga te wo tono no wakugwo ga torite nageka-mu
   my hand OBJ lord GEN child AGT take weep-AUX.ADN
   ‘Will my lord's child take my hand and weep again tonight?’


(12) Old Japanese (MYS 868; MYS 3351)
[Subject ga Object Ø V]:

a. Saywopimye no kwo ga pire Ø puri-si yama
   Sayohime GEN child AGT scarf wave-PST.ADN mountain
   ‘the mountain where the child Sayohime waved a scarf/did scarf-waving’

b. kanasiki kwo-ro ga ninwo Ø pos-aru kamo
   sad child-DIM AGT cloth hang.out-ADN Q
   ‘The sad child has hung out a piece of cloth.’


Given our assumption that ergative/active is assigned by v in a vP projection (5), the 
accusative is not licensed inside vP; the OSV dominant word order is derived by move- 
ment of the object to the left peripheral topic position; namely, the specifier of CP, as 
represented in (13). 

As discussed extensively in Yanagida & Whitman (2009), when the subject is marked
by ga, the objects that follow the subject are without exception non-branching noun
heads, as in pire ‘scarf’ and ninwo ‘cloth’ (12a)-(12b). These noun heads are syntactically
incorporated into the verb. Noun incorporation, which is widely observed in ergative
languages, is a detransitivizing process on a par with antipassives, in that both involve
a shift in valency, creating a derived intransitive (see Baker 1988). In other words, the
transitive verbs with the object in (12) pattern like unergative intransitives; the subject
is marked by ga, but the incorporated object is not assigned structural accusative case
by the verb.


⁶ModJ does not have noun incorporation in a strict sense. Noun incorporation discussed by Kageyama (1980)
such as kosi o kakeru ‘sit a seat’ vs. kosi-kakeru, tema o toru ‘take time’ vs. tema-doru are not productive.
These expressions are possibly analyzable as lexical compounds.

(13) [Tree diagram not recoverable from the source: a CP structure containing the label ‘DSM at Argument Structure’]


In this section, I have proposed that the alternation between ga and zero, as illustrated
in Table 1,⁷ arises within a smaller domain of a nominalized clause, namely vP.⁸


Table 1: DSM ga vs. zero in OJ

          Active    Inactive
Subject   ga        Ø
Object         Ø


The external argument is assigned active case by v[+Agt], in the same way as in Hindi. OJ,
however, displays Woolford's (2008) third level of DSM effects. The actual exponence or
marking of the feature [+Agent] is independently determined by language-particular PF
constraints, relatable to Silverstein's (1976) nominal hierarchy. Subject NPs higher on
the nominal hierarchy appear with active predicates, and NPs lower on the hierarchy
appear with inactive predicates.⁹


⁷As noted above, active marking is sensitive not only to the semantics of NPs but also to the semantics of
predicates. The subjects of transitive verbs and active intransitive verbs are necessarily marked by ga (or
no), but never by zero. (See §3.3 for no.)

⁸In §3.3, we discuss the other type of DSM, which arises in a higher domain of nominalized clauses, namely
the CP phase.

⁹Klimov (1977: 95-96) discusses a similar correlation between subject NPs and their predicates in active
languages.

3.2 Experiencer Predicates


Ergative (or active) languages often mark the subject of an experiencer verb with ergative 
(or active) case, treating it like an external argument. This is illustrated by Basque and
Hindi, respectively in (14)-(15). 


(14) Basque (isolate; Woolford 2008: 24)
Mikel-ek ni haserretu nau.
Michael-ERG 1SG.NOM angry.PERF AUX
‘Michael angered me.’


(15) Hindi (Indo-Aryan; Mohanan 1994: 142)
tusaar-ne vah kahaanii yaad kii
Tushar-ERG that story.NOM memory.NOM do.PERF
‘Tushar remembered that story.’


In Basque, the theme argument is marked by ergative case (14), while in Hindi, the 
experiencer is marked by ergative case (15). 

Kikuta (2012) points out that OJ ga appears on the non-agentive theme subject of
experiencer verbs, such as wasur- ‘forget’, omop- ‘think’, mi- ‘see’ etc., and that this raises a
problem for Yanagida & Whitman's (2009) hypothesis that ga is an active case. However,
all of Kikuta's examples of these psych verbs with ga-marked theme subjects appear with
an unspecified first person experiencer and a form of the auxiliary yu (stem ye-), which
derives middles, passives, and potentials.¹⁰


(16) Old Japanese (MYS 4407; MYS 3191)

a. imo ga kopisiku wasura-ye-nu-kamo
   my.lover AGT miss forget-MID-NEG.ADN-Q
   ‘Did I miss my dear and cannot forget her?’

b. yama kopeni-si, kimi ga omop-yu-raku-ni
   mountain cross-PST you/lord AGT think-MID-NMLZ-LOC
   ‘when you came to my mind as I was crossing the mountains’


-Yu is arguably related to the acquisitive light verb u (stem e-) ‘get’, which Whitman
(2008) proposes as the source of the well-known transitivity alternations in -e- in OJ and
later stages of the language. -E derives both transitives and intransitives, a property of


10 The productive passive auxiliary -yu in OJ appears after the irrealis (mizenkei) a-stem of the verb, as in (16a).
With a small number of verbs such as omopoyu in (16b), -yu appears after a different stem vowel, probably
reflecting an older fossilized pattern. The reviewer pointed out to me that current linguistic scholarship
(cf. Whitman 2008; Frellesvig 2010; Robbeets 2015) has mostly agreed with Ohno (1953) that the a-stem of
consonant verbs is nothing but a surface stem that diachronically reflects re-segmentation of suffixes in
initial *a-. With a polysyllabic vowel-final stem followed by a polysyllabic vowel-initial suffix, we would
expect the first vowel to drop, thus *omop-ayu. However, the productive medial OJ -(a)yu may have been
derived from the copula *a- ‘to be’ followed by the original causative/medial *-yu (Robbeets 2015). Adding
omopo- and -yu would give the expected result.
acquisitives such as the English auxiliary get. If this analysis is correct, experiencer middles
such as (16) may have an original transitive source, i.e. ‘my dear got me to forget her’, ‘you
got me to think of him’. That is, (16) can be analyzed as a causative middle construction;
the theme subject serves as the causer argument of the verb + yu. A parallel construction
can be seen, for example, in Assamese, cited by Woolford (2008), where the theme subject
of an experiencer verb is marked ergative when the light verb make/do is added to the
verb:


(17) Assamese (Indo-Aryan; Woolford 2008)
     a. gan-tu-e        xap-tu-k         khogal  korile
        song-CLASS-ERG  snake-CLASS-DAT  anger   made/did
        ‘The song angered the snake.’

     b. boroxun-e  Ram-ok   xant  korile
        rain-ERG   Ram-DAT  calm  made/did
        ‘The rain calmed Ram.’


The subject is the external argument of the light verb korile ‘make/do’ and is assigned 
ergative in Assamese. Facts like these show that languages may differ as to which ar- 
gument is mapped to the external argument position. The agent subject is invariably an 
external argument, but in some languages the causer argument of a psych-verb can be 
an external argument, and thus an agent, marked with ergative. 

In OJ, there are also some instances in which ga marks clausal complements of
psychological adjectives (or experiencer adjectives) that end with si, such as po-si ‘want’ or
kana-si ‘sad’ (-si may be historically related to the verb si ‘do’), as shown in (18).
Importantly, these clausal complements are always marked by ga but never marked by no or
zero.


(18) Old Japanese (MYS 4338; MYS 1007)

     a. [papa    wo   panarete  yuku]   ga   kana-si  sa
        mother   OBJ  part      go.ADN  AGT  sad-do   NMLZ
        ‘I am sad about parting from mother.’

     b. [tada  pitorigo   ni   aru]    ga   kuru-si     sa
        only   one.child  par  be.ADN  AGT  painful-do  NMLZ
        ‘I am pained that I am the only child...’


Although the two types of ga (the genitive ga and the ga marking the clausal complement
of psych adjectives) have been widely recognized, the historical relation between the
two has not been examined. In (18a)-(18b) the theme argument of psych verbs appears
in external argument position marked by ga, whereas an unspecified (or implicit)
experiencer is an internal argument identified as first person singular (i.e. the speaker).
(16a)-(16b) are apparently related to (18a)-(18b) in that they originate from a psych-transitive
predicate with an unspecified first person experiencer object. Thus, (18a) literally means
that ‘parting from my mother made me sad’, as represented in (19).
(19) [... V.ADN] ga [vP pro[+1SG] [AP ...] si ‘do’]


The clausal subject in (18), as in the case of (16), serves as the causer, and thus the agent,
of the matrix predicate po-si ‘do-wanting’, kana-si ‘doing sad’. Below in §4, I will argue
that after OJ, this psych transitive construction was reanalyzed as intransitive, taking a
single theme argument; this was the historical source of nominative ga.


3.3 DSM in syntax 


In §3.1, I showed that DSM effects identified at the level of argument structure within vP
constitute semantically motivated case alternations between ga and zero. In this section, we
discuss the other type of DSM, associated with the alternation between no and zero. The latter
type of DSM occurs when the subject NP is located in a position lower on the nominal
hierarchy. A primary question to be addressed is: What is the difference between
no-marked NPs and zero-marked NPs, given that both markings appear on nominals whose
semantic features are lower on the hierarchy? Examples (20a)-(20b) indicate that OJ has
DSM associated with a specific/non-specific distinction, on a par with DSM in Turkish
and other languages with genitive subjects in nominalized clauses:


(20) Old Japanese (MYS 4066)

     a. [u       no   pana    no   saku]  tukwi  tati-nu
        deutzia  GEN  flower  GEN  bloom  month  pass-PERF
        ‘It was the month when the deutzia flower blooms.’

     b. [okitu  mo       no   pana    Ø  saki-tara]-ba  ware  ni   tuge  koso
        offing  seaweed  GEN  flower     bloom-PERF-if  I     DAT  tell  FOC
        ‘If seaweed flowers were to bloom in the offing, tell me. (But they would not
        bloom.)’


In (20a) the author composes the song at the sight of the deutzia flower in the garden
where the banquet was held, thus referring to a specific entity. In (20b), on the other hand,
the flower in the subjunctive conditional ba ‘if’-clause is unambiguously non-specific,
since it is neither in the author's sight nor mentioned in the preceding context.

In Turkish, as is well known, subjects of subordinate clauses marked by genitive are 
always specific, but when the subordinate subject is nominative, that is, zero-marked, its 
referent is interpreted as non-specific. Woolford (2008) argues that DSM in Turkish is 
determined at the level of syntax. Consider (21a)-(21c). 


(21) Turkish (Turkic; Kornfilt 2003)
     a. [(bir) arı-nın  bugün  çocuğ-u    sok-tuğ-un]-u        duy-du-m
        bee-GEN         today  child-ACC  sting-F.NOM-3SG-ACC  hear-PST-1SG
        ‘I heard that the bee/a bee [+specific] stung the child today.’

     b. [çocuğ-u   bugün  (bir) arı  sok-tuğ-un]-u        duy-du-m
        child-ACC  today  bee        sting-F.NOM-3SG-ACC  hear-PST-1SG
        ‘I heard that today bees/a bee [-specific] stung the child.’

     c. *[(bir) arı  Ø  çocuğ-u    bugün  sok-tuğ-un]-u        duy-du-m
        bee             child-ACC  today  sting-F.NOM-3SG-ACC  hear-PST-1SG
        ‘I heard that today bees/a bee [-specific] stung the child.’


As originally observed by Kornfilt (2003; 2008), genitive subjects move outside vP,
thus appearing before the object (21a). Unmarked nominative subjects in subordinate clauses
must appear adjacent to the verb, resulting in OSV order (21b)-(21c). OJ no-marked vs.
zero-marked subjects behave exactly like their Turkish counterparts, as evidenced by (22a)-(22b).


(22) Old Japanese (MYS 3689; MYS 2665)

     a. ipe   pito     no   idura-to    ware  wo   topa-ba  ikani  ipa-mu
        home  someone  GEN  where-that  I     OBJ  ask-if   how    say-AUX
        ‘How should (I) say if someone in your family asks me where (you) are?’

     b. waga    kosi    wo   pito     Ø  mike-mu      kamo
        1P.AGT  coming  OBJ  someone     see-FUT.ADN  Q
        ‘Would someone see me coming?’


In (22a), the no-marked subject pito ‘person’ has a SPECIFIC reading; it picks out
someone among the family members.¹¹ Example (22b), in contrast, has a NON-SPECIFIC reading: the
existence of a set of individuals is completely undefined in previous discourse. Subjects
marked by no, unlike ga-marked subjects, can appear preceding the wo-marked object.
Unmarked subjects, in contrast, appear strictly adjacent to the verb. Yanagida (2007)
provides quantitative data for zero-marked subjects in the Man'yōshū. Of a total of 667
zero-marked subjects found in the Man'yōshū, 580 occur immediately adjacent to the verb,
and 9 instances of non-conclusive transitive clauses have the pattern [Object wo Subject
Ø V], given in (22b). These examples, however, without exception, appear in main clauses
(Yanagida 2007: 183). Transitive subjects are never marked zero in embedded clauses.¹²

The word order facts indicate that OJ nominalized clauses employ DSM in parallel
to DOM associated with a specific/non-specific distinction. They are configurationally 
determined in the syntax. While the zero-marked subject of transitive verbs remains 
in the external argument position, namely the specifier of vP, the subject marked by 
genitive moves to the specifier of TP. This is represented in (23). 


11 I assume that SPECIFIC entities presuppose the existence of a set of individuals; the set of individuals is
discourse-linked and refers to a previously mentioned set (cf. Enç 1991).

12 As noted above, OJ displays a main/embedded split case system. In main clauses, the
subjects of both transitive and intransitive verbs are marked by zero.
(23) DSM in syntax [tree diagram rooted in CP; structure not recoverable from the source]


The genitive subject construction (23) has a nominative-accusative pattern; the geni- 
tive subject is case-licensed by Cp, yz], and the accusative object is case-licensed by v. 


4 The historical development of nominative ga 


It is well known that ga in both possessor and subject/agent marking functions
drastically decreased after OJ. The ratios between ga and no in the Man'yōshū (OJ; 8th century)
and in Genji monogatari (EMJ; 11th century), taken from the Corpus of Historical Japanese
(CHJ) produced by the National Institute of Japanese Language and Linguistics (NINJAL),
are given in Tables 2 and 3.


Table 2: The ratios between ga and no in the Man'yōshū (Koji 1988)


                   -ga          -no
Subject            372 (48%)    411 (52%)
Possessor          606 (10%)    5711 (90%)
Clausal subject    19 (100%)    0


These two tables indicate that ga in both subject and possessor functions was sig- 
nificantly reduced in Genji monogatari, written in the EMJ period. In Genji, 39 out of 


13 In Table 3, the quantitative data taken from the corpus are limited to the sequences Noun+ga/no+Verb
(Subject), Noun+ga/no+Noun (Possessor), and Adnominal Clause+ga/no+Verb (Clausal subject), respectively,
due to the design of the corpus. The figures are therefore not precisely the total occurrences of ga/no in
the subject/possessor/clausal patterns.
Table 3: The ratios between ga and no in Genji monogatari (ca. 1010, CHJ)


                   -ga          -no
Subject            57 (4%)      1358 (96%)
Possessor          78 (0.7%)    11302 (99.3%)
Clausal subject    261 (98%)    4 (2%)


57 tokens of ga-marked subjects are personal pronouns, of which 24 are first person
waga, which was already the lexicalized first person pronominal form for both
possessor and subject. In contrast, instances of ga marking clausal subjects which select
psych-predicates, as illustrated in (18), drastically increased after OJ.¹⁴

A further significant change in EMJ is that the OSV dominant order associated with 
ga was completely lost. This change directly results from the fact that transitive subjects 
came to be either zero-marked or marked by genitive no as in (24), resulting in [S (no) O 
wo V] basic word order, as represented in (23): 


(24) Early Middle Japanese (Papakigi; Genji)
     [ki    no   miti  no   takumi]    no   yorodu   no   mono   wo   tukuri
     wood   GEN  tool  GEN  craftsman  GEN  various  GEN  thing  OBJ  make
     idasu  mo
     out    EXCL
     ‘The craftsman invents various things.’


These observations suggest that EMJ is characterized as displaying the transition from 
an active system to an accusative system. In the following (§4.1-§4.3), I will discuss three 
possible scenarios for this shift in alignment in the history of Japanese. 


4.1 Scenario 1: Antipassive > Accusative


A number of researchers propose that alignment change from ergative/active to
accusative arises as a result of reanalysis of antipassives (cf. Harris & Campbell 1995; Bittner
& Hale 1996; Aldridge 2012).¹⁵ The transition from ergative to accusative begins when the
oblique object in antipassives is reanalyzed as accusative. This explanation for alignment
change may be applicable to ergative languages that have antipassive constructions. Not
all languages do, of course: Polinsky (2013) and Comrie (2013) identify 14 ergative and


14 The CHJ corpus is not designed to make distinctions between clause types. However, it is well known
among traditional Japanese grammarians that the subject marker ga/no is restricted to what Yanagida
& Whitman (2009) identified as nominalized clauses in OJ and EMJ. While no remains a genitive marker
throughout its history, ga started to mark the subject in main clauses in Late Middle Japanese (see Table 5,
cited from Yamada 2000). By this period, the adnominal endings had been reanalyzed as matrix clause
endings.

15 In antipassives, the external argument has absolutive status rather than ergative, while the notional object
is either dropped or marked as an oblique.






2 active languages with no antipassives. OJ had no antipassives. Thus the reanalysis of
antipassives is not a possible diachronic pathway from non-accusative to accusative for 
Japanese. 


4.2 Scenario 2: Active > Nominative


Harris & Campbell (1995: 258) describe as a possible but hypothetical change a shift from 
active to accusative alignment caused by reanalysis of an active case marker as
nominative.¹⁶ King (1988) suggests a somewhat similar hypothesis on the basis of the view that
the Korean nominative marker -i was originally an ergative marker that underwent a 
shift to nominative, as shown in Table 4. King hypothesizes that -i originates as an erga- 
tive case and the nominative function of -i arises as a result of ergative -i coming to mark 
intransitive subjects. 


Table 4: Alignment change in Korean (King 1988)


                             Direct object    Intransitive subject    Transitive subject
Before change: Ergative      Ø                Ø                       -i
After change: Accusative     Ø/-l             Ø/-i                    Ø/-i


Whitman & Yanagida (2015) show that King's hypothesis is not supported by the
Korean data. In the case of Japanese, ModJ nominative ga does not directly descend from OJ
genitive ga used to mark active subjects. Ga became highly infrequent as an NP subject
marker in EMJ around the 9th-10th centuries.

Yamada (2000) examines the reappearance of ga as nominative in the text known as
the Amakusa Heike, which was published in the late 16th century.¹⁷ Table 5, cited from
Yamada (2000), shows that while subject marker ga was restricted to embedded clauses
in OJ and EMJ, it started to reappear in main clauses in Late Middle Japanese (LMJ).


Table 5: Ga in main clauses (Amakusa Heike 1592, Yamada 2000).


       Genitive    transitive    unergative    adjective    unaccusative    total
ga     0 (0%)      2 (2%)        13 (16%)      15 (18%)     54 (64%)        84 (100%)


According to Yamada, nominative ga in LMJ starts out as a marker for the subject 
of intransitive verbs, in particular, unaccusative verbs, and rarely marks the subject of 


16 Klimov (1974; 1977) also suggests that the development from active into nominative is a widespread
development.

17 The Amakusa Heike is a romanized version of the Heike Monogatari. It was composed as a textbook to teach
Japanese to foreign missionaries.
transitive verbs. Ga appears on transitive subjects after the mid 17th century. Table 6
presents the ratios between ga and no in the Toraakira-bon Kyogen published in 1642. 


Table 6: The ratios between ga and no (Toraakira bon, 1642, CHJ)


                   -ga           -no
Subject            1622 (76%)    503 (24%)
Possessor          353 (7%)      5267 (93%)
Clausal subject    20 (100%)     0 (0%)


The data in the Toraakira bon reveal that transitive clauses came to appear in the 
canonical [S ga O o V] pattern in EModJ (1600-1800), as shown by the data in (25): 


(25) Early Modern Japanese (Toraakira bon, 1642)
     ano   mono    ga   orusu        o    itase-ba
     that  person  NOM  watch.house  ACC  do-if
     ‘if that person watches over the house...’


These facts raise a basic question concerning the assumption that case systems shift
from active to accusative: If OJ active ga is the ancestor of ModJ nominative ga, why did
ga decrease drastically in frequency in EMJ, only to reappear with unaccusative rather than
transitive verbs?
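
The frequency trajectory behind this question can be recomputed from the token counts printed in Tables 2, 3 and 6. The following Python sketch is only an illustration and not part of the original analysis: the names COUNTS, ga_share and subject_trajectory are hypothetical, and the counts are simply copied from the tables as printed.

```python
# Token counts (ga, no) copied from Tables 2, 3 and 6 as printed.
# The dictionary layout and all names here are illustrative assumptions.
COUNTS = {
    "Man'yoshu (OJ, 8th c.)": {"subject": (372, 411), "possessor": (606, 5711)},
    "Genji (EMJ, ca. 1010)": {"subject": (57, 1358), "possessor": (78, 11302)},
    "Toraakira bon (EModJ, 1642)": {"subject": (1622, 503), "possessor": (353, 5267)},
}


def ga_share(ga: int, no: int) -> float:
    """Return the proportion of ga tokens among all ga + no tokens."""
    total = ga + no
    if total == 0:
        raise ValueError("no tokens to compare")
    return ga / total


def subject_trajectory(counts=COUNTS):
    """List (text, share of ga among ga/no-marked subjects) in chronological order."""
    return [(text, ga_share(*funcs["subject"])) for text, funcs in counts.items()]


if __name__ == "__main__":
    for text, share in subject_trajectory():
        print(f"{text}: ga marks {share:.1%} of ga/no-marked subjects")
```

Run as a script, this prints one line per text, making the drop in EMJ and the later rise in EModJ directly comparable across the three tables.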

To account for these facts, I propose a third scenario; that is, a global shift from active 
to nominative never took place in Japanese. Instead, change in the semantic features of 
individual case markers, ga and wo, reorganized the overall grammatical structure of the 
language. 


4.3 Scenario 3: Impersonal psych transitive > Intransitive 


Japanese has been a so-called pro-drop language throughout its history; sentences often contain
no overt subject. This means that learners of OJ were presented with scant evidence that
the object moved to the left of the subject, since direct evidence for OSV would be
available only in sentences with overt subjects. As a result, object movement was eventually
lost. The loss of object movement then resulted in a reanalysis of wo as a pure structural
accusative case. The reanalysis of wo subsequently led to another change: ga-marked
subjects were unable to remain in the specifier of vP. Yanagida (forthcoming)
proposes that this is attributable to the subject in-situ generalization (SSG), originally 
proposed by Alexiadou & Anagnostopoulou (2001). The SSG is analyzed as the general 
condition on structural case, which states that if two DP arguments are merged in the 


18 Frellesvig et al. (2018 [this volume]) argue that DOM is no longer operative in EMJ. In EMJ, wo was
established as the structural accusative case. Its range of use was expanded to mark direct objects even with
non-specific reading. Because of this change, the division between wo-marked objects and unmarked
objects became semantically opaque.
vP domain, at least one of them must externalize. Alexiadou & Anagnostopoulou (2001)
argue that the SSG applies synchronically in a variety of constructions across languages.
I suggest that the SSG provides a diachronic explanation for the loss of ga-marked
subjects of transitive verbs. That is, once wo was reanalyzed as structural accusative and the
object remained inside the vP domain, the subject was no longer able to stay in the specifier
of vP; it had to move outside vP. This resulted in the dramatic increase in tokens of the
[DP no DP wo V] construction (23) in EMJ.

Recall that (26) is the impersonal psych transitive construction that involves an im- 
plicit first person experiencer object. 


(26) Old Japanese (MYS 4338)
     [papa    wo   panarete  yuku]   ga   kana-si  sa
     mother   OBJ  part      go.ADN  AGT  sad-do   NMLZ
     ‘I am sad about parting from mother.’


As shown in Table 3 above, examples like (26) significantly increased in frequency 
after OJ. Some examples are given in (27) cited by Ohno (1977: 142). Ohno (1977; 1987) 
observes that in EMJ, adnominal clauses marked by ga are used predominantly with 
psych predicates with a first person experiencer (27a), as is the case in OJ, but that they 
began to appear with non-psych intransitive verbs (27b). 


(27) Early Middle Japanese (Kocho/Genji, Usugumo/Genji)

     a. [kokorobape  wo   mi-ru]   ga   wokasi-u        mo
        kindness     ACC  see-ADN  AGT  thankful-CONCL  EXCL
        ‘Seeing (someone's) kindness makes (me) thankful.’

     b. [kumo  no   usuku    watare-ru]     ga   nibi  iro    na-ru       wo
        cloud  GEN  shallow  pass.away-ADN  AGT  red   color  become-ADN  EXCL
        ‘the clouds passing thinly away become red’


In (27b) the adnominal clause marked by ga is the subject of a non-psych intransitive 
verb, and it involves no implicit first person experiencer. A further change in EMJ is that 
while this psych predicate construction was used only in nominalized clauses in OJ, it 
came to appear in non-nominalized main clauses as in (27a). Based on MJ (800-1600)
data, I hypothesize that ModJ nominative ga is descended from ga marking the clausal
complements of psychological predicates. In line with Ohno's (1977; 1987) observations
and the data collected from the corpus, I propose that nominative ga developed as a result of
a reanalysis of the impersonal psych-transitive construction as an unaccusative intransitive
in which the ga-marked argument came to be the sole argument of the predicate, that is,
nominative. Ga reappeared in LMJ as a nominative postposition, marking the theme argument
of intransitives, and it was extended to mark the subjects of transitive verbs in EModJ. This
scenario gives a straightforward explanation for why nominative ga started to mark the subject
of intransitive verbs, as observed by Yamada (2000).
5 Summary


I have argued that the semantic opposition between case-marked and zero-marked subjects
in OJ nominalized clauses shows two types of DSM effects which fit with well-established
cross-linguistic patterns. I have also argued that the reanalysis of wo as structural
accusative is a direct cause of the loss of active ga marking the subject of transitive verbs.
The quantitative data in EMJ and LMJ suggest that nominative ga emerges as a result of a
reanalysis of psych-transitive predicates as intransitive where the ga-marked argument is
the sole argument of the predicate. It has been widely believed that case systems change
from non-accusative to accusative or accusative to non-accusative alignment. The OJ data
support the view that case systems do not merely shift from one alignment to another
due to a single change. Instead, a cascade of changes in the morphological/semantic
features of individual case markers, as exemplified by OJ and EMJ ga and wo, occurs over
time, eventually leading to overall change of case marking systems in a given language.


Digitized texts 


• The Japanese Historical Corpus, the National Institute of Japanese Language and
  Linguistics, https://maro.ninjal.ac.jp/

• The Oxford Corpus of Old Japanese, http://vsarpj.orinst.ox.ac.uk/corpus/

• Man'yōshū Kensaku, Yamaguchi University
  http://infux03.inf.edu.yamaguchi-u.ac.jp/-manyou/ver2 2/manyou.php


Abbreviations

ABS     absolutive        HON      honorific
ACC     accusative        IMPERF   imperfective
ADN     adnominal         LOC      locative
AGT     agent             MID      middle
ASP     aspect            MOD      modal
AUX     auxiliary verb    NEG      negative
CONC    concessive        NMLZ     nominalizer
CONCL   conclusive        NOM      nominative
CONJ    conjunctive       NONFUT   non-future
CONT    continuative      OBJ      object marker
DAT     dative            PST      past
DIM     diminutive        PL       plural
ERG     ergative          PRT      second position particle (an evidential)
EXCL    exclamative       PERF     perfective
F       female            1P       first person
FOC     focus marker      2P       second person
FUT     future            Q        question particle
GEN     genitive
Chapter 15


The partitive A: On uses of the Finnish 
partitive subject in transitive clauses 


Tuomas Huumo 


University of Turku, University of Tartu 


Finnish existential clauses are known for the case marking of their S arguments, which
alternates between the nominative and the partitive. Existential S arguments introduce a
discourse-new referent, and, if headed by a mass noun or a plural form, are marked with the
partitive case that indicates non-exhaustive quantification (as in ‘There is some coffee in the
cup’). In the literature it has often been observed that the partitive is occasionally used even
in transitive clauses to mark the A argument. In this work I analyze a hand-picked set of
examples to explore this partitive A. I argue that the partitive A phrase often has an animate
referent; that it is most felicitous in low-transitivity expressions where the O argument
is likewise in the partitive (to indicate non-culminating aspect); and that a partitive A phrase
typically follows the verb, is in the plural and is typically modified by a quantifier (‘many’, ‘a
lot of’). I then argue that the pervasiveness of quantifying expressions in partitive A phrases
reflects a structural analogy with (pseudo)partitive constructions where a nominative head
is followed by a partitive modifier (e.g. ‘a group of students’). Such analogies may be relevant
in permitting the A function to be fulfilled by many kinds of quantifier + partitive NPs.


1 Introduction 


The Finnish argument marking system is known for its extensive case alternation that 
concerns the marking of S (single arguments of intransitive predications) and O (object) 
arguments, as well as predicate adjectives. In this system, the partitive case plays a
central role: it alternates with the accusative in the marking of O (see e.g. Heinämäki 1984;
1994; Kiparsky 1998; Huumo 2010; 2013, and §3 of this paper), and with the nominative in
the marking of both S (in existential clauses; see e.g. Huumo 2003) and predicate adjec- 
tives (see Huumo 2009). By contrast, A arguments are, in principle, always in the nomi- 
native in Standard Finnish. However, since the late 19th century, scholars have pointed 
out that the partitive is occasionally used even in the marking of A arguments, in spite 
of the fact that until the 1990's, the Finnish language planning authorities condemned 
such uses as ungrammatical. 


Tuomas Huumo. The partitive A: On uses of the Finnish partitive subject in
transitive clauses. In Ilja A. Serzant & Alena Witzlack-Makarevich (eds.),
Diachrony of differential argument marking, 383-411. Berlin: Language Science Press.
In this paper, I study such partitive A arguments with data manually gathered from the
Internet. I will argue that the partitive A combines features of canonical (nominative) A
arguments (animate and agentive referents are typical) and existential S arguments (the
referent is typically discourse-new and indicates non-exhaustive quantity). I will also
argue that the most typical contexts for the partitive A are low-transitivity expressions. This
is the main reason why not only A but also O is in the partitive case in most instances:
when marking the O (of an affirmative clause) the partitive indicates a non-culminating 
event or a quantitatively non-exhaustive reference. If the accusative O is used with a 
partitive A, then the reading of A is distributive: each of its referents participates in the 
event individually. 

In the marking of S arguments, the Finnish partitive subject¹ is best known for its use
in existential clauses, where the partitive case marks subject NPs headed by mass nouns 
or plurals. The referent of the existential partitive S is typically discourse-new and con- 
sists of a non-exhaustive quantity of a substance (mass nouns) or of a multiplicity (plu- 
rals); a characteristic feature of the partitive is non-exhaustive reference (an indefinite, 
open or unbounded quantity in different terminologies), whereas its counterparts, the 
nominative and the accusative, typically indicate exhaustive reference (definite, closed, 
or bounded quantity). Consider (1) and (2), which are canonical existential clauses with 
a clause-final S; for the uses of the partitive subject in old literary Finnish, see De Smit 
(2016). In existential clauses, only S arguments headed by a singular count noun take the 
nominative case (3). 


(1) Kupi-ssa  on          kahvi-a.
    cup-INE   be.PRS.3SG  coffee-PAR
    ‘There is coffee in the cup.’


(2) Leikkikentä-llä  juokse-e     laps-i-a.
    playground-ADE   run-PRS.3SG  child-PL-PAR
    ‘There are children running in the playground.’


(3) Pöydä-llä  on          kirja.
    table-ADE  be.PRS.3SG  book.NOM
    ‘There is a book on the table.’


As can be seen from (2), the partitive S does not trigger verb agreement: the verb is 
in a 3rd person singular form in spite of the plural partitive. The typical position of the 
existential S arguments is after the verb, but since the Finnish word order is discourse- 
pragmatically conditioned (for details, see Vilkuna 1989: 35-62), existential S arguments 
may also have a clause-initial position. On the other hand, indefinite or focused non- 
existential S arguments are likewise often placed towards the end of the clause (see 


1 The subjecthood of this element is under dispute; see Huumo & Helasvuo (2015) and the literature
mentioned there. However, the term subject is conventionally used for it, and I follow this practice for
convenience.
Vilkuna 1989: 187-191). This means that word order is not a reliable criterion in
distinguishing existential and intransitive (4) constructions from each other (cf. also Huumo
& Lindström 2014, for a comparison of Finnish and Estonian).


(4) Kirja     on          pöydä-llä.
    book.NOM  be.PRS.3SG  table-ADE
    ‘The book is on the table.’


In Finnish linguistics, a lively debate on existential clauses and the functions of the 
partitive S has been going on since the 1950's. It has been pointed out that though actual 
usage concentrates around certain semantically pale existential verbs (e.g. olla ‘be’ and
tulla ‘come’), the range of (intransitive) verbs available for the existential construction
is actually wide and includes even highly agentive verbs such as juosta ‘run’, opiskella
‘study’, tapella ‘fight’ and tanssia ‘dance’. In earlier works, scholars attempted to build
exhaustive lists of “existential verbs”, but in the 1970's this attempt was more or less given
up. More recent analyses (e.g. Huumo 2003, Huumo & Helasvuo 2015), with Schlachter 
(1958) and Siro (1974) as their early predecessors, have emphasized the construction-level 
meaning of the existential clause, arguing that the construction backgrounds the activity 
indicated by the verb and foregrounds the locational relationship that prevails between 
the typically clause-initial locative adverbial and the existential S, the referent of which 
is introduced as a discourse-new entity within the location. 

In many works on Finnish existential clauses, it has been pointed out that the partitive 
subject is occasionally used even in transitive clauses, especially if the verb and the object 
form an idiomatic phrase (5) but sometimes in other low-transitivity predications as well 
(6)-(7) (e.g. Siro 1964: 77; Ikola 1972; Saarimaa 1967; Penttilä 1963; Yli-Vakkuri 1979: 156- 
157; Hakulinen & Karlsson 1979: 167-168; Sands & Campbell 2001; a generative approach 
is Nikanne 1994). 


(5) Use-i-ta        sotila-i-ta     sa-i         surma-nsa       taistelu-ssa.
    several-PL-PAR  soldier-PL-PAR  get-PST.3SG  death-ACC.3POSS battle-INE
    ‘Several soldiers got killed (literally: “got their death”) in the battle.’

(6) Mon-i-a      ihmis-i-ä      odott-i       satee-ssa  bussi-a.
    many-PL-PAR  person-PL-PAR  wait-PST.3SG  rain-INE   bus-PAR
    ‘Many people were waiting for the bus in the rain.’

(7) Keitto-a  seuras-i        erilais-i-a       liha-,  kala-  ja
    soup-PAR  follow-PST.3SG  different-PL-PAR  meat    fish   and
    vihannes-ruok-i-a.
    vegetable-dish-PL-PAR
    ‘The soup was followed by different dishes of meat, fish, and vegetables.’ (Ikola
    1972)


Especially during the 20th century, such uses attracted the attention of language
planning authorities (e.g. Saarimaa 1967; Ikola 1972; 1986: 139), who considered them errors
and recommended the use of the nominative instead (e.g. usea-t sotilaa-t sai-vat...
[many-PL.NOM soldier-PL.NOM get-PST.3PL] in (5)). A more tolerant approach was adopted by
Itkonen (e.g. 1988 and the more recent Itkonen & Maamies 2007), who accepted the
partitive A in transitive constructions that are semantically close to (and can be rephrased
by) existential clauses proper, such as (5) above, but condemned the wider use as “going
against language intuition”.

In an insightful paper, Yli-Vakkuri (1979) analyzed the partitive A² with data consisting
of examples from the earlier linguistic literature, as well as a hand-picked set of
39 examples she collected from literary fiction, newspapers and spoken discourse. In
her data, the following transitive verbs and verb-object combinations are used with the
partitive A: 1) seurata ‘follow’, kohdata ‘meet’, odottaa ‘wait’; 2) expressions where the
object elaborates the activity designated by the verb rather than introduces a referent,
as in ‘play cards’ or ‘sing hymns’; 3) perception verbs such as kuunnella ‘listen’ and
katsella ‘watch’.³ Yli-Vakkuri also observed that in most instances the subject NP includes
a quantifying expression, as in examples (5) and (6) (‘several’, ‘many’). Unquantified
subject phrases that consist of a partitive-marked noun alone occurred only three times
in her 39 examples. In addition, there were three instances where the NP included an
adjectival modifier. The rest of Yli-Vakkuri's examples, 33 instances, include a quantifying
element. The role of the quantifying element thus seems to be central and will be
discussed below (§6) in more detail.

The partitive A appears to be quite rare in text corpora. In the Syntax Archives corpus
at the University of Turku there is only one occurrence of a partitive NP used as a
transitive subject in written⁴ text materials (8).


(8) tulirokko-a seura-a usein jälkitaute-j-a
scarlet.fever-PAR follow-PRS.3SG often complication-PL-PAR
‘Scarlet fever is often followed by complications.’


Example (8) has a partitive A in the clause-final position. It is a plural form introducing
a discourse-new, quantitatively non-exhaustive multiplicity (‘[some] complications’)
and thus resembles the canonical existential S in (1)-(3). As is typical in existentials, the
verb in example (8) does not agree in number with the plural partitive A. In general,
verbs never show agreement with a partitive S in number or person in Finnish, and such
a use would be blatantly ungrammatical for the native speaker's ear.⁵


² Yli-Vakkuri did not use this term.

³ It may be worth pointing out that the verbs in groups 1 and 3 assign the (aspectually motivated) partitive
case to their object, while the verbs in group 2 also allow the accusative object, if the culmination of the
event is indicated (‘to sing a hymn from the beginning to the end’). However, in their uses with the partitive
A, group 2 verbs take the partitive object, which then indicates progressive aspect. For object marking in
Finnish in general, see §3.

⁴ In the spoken dialect materials of the Syntax Archives, there are some occurrences where the pronoun
ke-tä [who-PAR] is used as a transitive subject. However, such occurrences do not count as instances of
the partitive A proper, because this partitive form of the pronoun kuka ‘who’ is productively used in the
function of the nominative in the southwestern dialects of Finnish; see also the discussion of the quantifier
monta (which is morphologically a partitive form but behaves like a nominative syntactically) in (§6.2).

⁵ Serzant (2015: 395) points out that in North Russian, as well as in Veps, which is closely related to
The Syntax Archives corpora thus suggest that the partitive A is indeed a rare phenomenon.
However, its rarity in edited written texts may be caused by language planning
authorities, who have considered usages like (8) as errors. Being aware of this, the
authors and editors of the texts in the corpora may have avoided using it. Unedited
Internet texts turn out to be a more fruitful source, where it is relatively easy to find
occurrences, provided one knows what to look for.

In this work, I use hand-picked (or Google-picked) data to discuss the partitive A. As 
my starting point, I have the set of examples from Yli-Vakkuri's (1979) work, as well as 
another hand-picked set of 20 examples (courtesy of Jaakko Leino), including examples 
such as (9) and (10): 


(9) Jo-ta-in unkarilais-i-a esitt-i siellä kansa-n-tansse-j-a.
some-PAR-CLIT Hungarian-PL-PAR perform-PST.3SG there folk-GEN-dance-PL-PAR
‘Some Hungarians were performing folk dances there.’ (from a private
conversation; example courtesy of Jaakko Leino)


(10) (Miten on mahdollista, että)
korti-n on saa-nut henkilö-i-tä,
card-ACC have.PRS.3SG get-PTCP person-PL-PAR
(joilla ei ole mitään yhteyttä keskustaan?)
‘(How is it possible that) the card has been given to persons (who have no
connection with the Centre [a political party])?’ (Verkko-Ilta-Sanomat 28.5.2010,
example courtesy of Jaakko Leino)


In example (9), the partitive A is clause-initial, animate and agentive. It differs clearly 
from the partitive A in example (8), which is more similar to existential S arguments 
by being clause-final and inanimate. In both (8) and (9), the partitive A is an indefinite 
plural form. In example (10), the partitive A is clause-final and animate but not agentive. 
Another special feature of (10) is that the object kortin is in the accusative case — note 
that in the earlier examples, (8) and (9), not only the partitive A but also the object NP 
(O) has been in the partitive. Indeed, it seems that the partitive A favors contexts where 
O is likewise in the partitive. This is a striking feature, because it results in two partitive- 
marked NPs being arguments of the same verb. 

In collecting data for this study with Google, I have used search strings with a specific
verb form that is either preceded or followed by a partitive form of a semantically
schematic noun, or by a quantifying expression that typically combines with a
partitive-marked noun to form an NP. Such data are of course extremely biased and give no ground
for a statistical analysis. Nevertheless, as a result I have a set of 117 attested examples
from actual language use, and it would be easy to expand this set by further searches.
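
The way such search strings are put together can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for this study: the helper name build_search_strings is hypothetical, the word forms are taken from examples discussed in this chapter, and wrapping a string in double quotes is assumed to request an exact-phrase match from the search engine.

```python
from itertools import product


def build_search_strings(verb_forms, partitive_items):
    """Hypothetical helper: pair each verb form with each partitive item.

    Every pair yields two exact-phrase queries, one with the partitive
    item before the verb and one with the item after the verb.
    """
    queries = []
    for verb, item in product(verb_forms, partitive_items):
        queries.append(f'"{item} {verb}"')  # partitive item precedes the verb
        queries.append(f'"{verb} {item}"')  # partitive item follows the verb
    return queries


# Word forms from the chapter: seurasi 'followed', ihmisiä 'people' (PL-PAR),
# useita 'several' (PL-PAR quantifier).
verb_forms = ["seurasi"]
partitive_items = ["ihmisiä", "useita"]

queries = build_search_strings(verb_forms, partitive_items)
for query in queries:
    print(query)
```

Each hit returned for such a string still has to be inspected by hand, since the string itself cannot tell a partitive A from a partitive O.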


Finnish, the verb sometimes shows number and person agreement with a plural partitive subject (see also
Koptjevskaja-Tamm & Wälchli 2001).
This means that my study can give a picture of contexts and constructions where the
partitive A at least can be used in unedited written Finnish. I have also used my own
and my colleagues' native speaker intuitions for grammaticality judgments of the examples.
Given that language planning authorities have considered the
partitive A an error, it may be surprising that practically no example in my data set feels
blatantly ungrammatical to the native speaker's ear.

More specifically, the following points of view will be brought up in this work; points 
1-3 concern the synchronic distribution of the partitive A while point 4 also has di- 
achronic connotations. 


1. What is the partitive A like in terms of its own grammatical structure and lexical 
semantics (e.g. animacy)? What is its semantic role in the clause? What kinds of 
verbs does it occur with? 


2. What is the role of the quantifiers that seem to be typical in NPs with the function 
of a partitive A? 


3. Why is the object NP also in the partitive in most instances? When is the ac- 
cusative object used? 


4. What is the motivation for using the partitive case in a transitive subject NP? Are 
there grammatical systems in the language that pave the road for the partitive 
subject to spread into transitive clauses? 


I will discuss the grammatical and semantic features of the partitive A and the range
of verbs in my data in §2. §3 concentrates on the object NP and its case marking, and
§4 on agreement and word order. §5 discusses transitive infinitival constructions used
as adverbial modifiers in matrix clauses that have an intransitive verb and a partitive
S, arguing that such constructions probably give analogical support to the partitive A.
§6 discusses the role of the quantifiers that are common in NPs with the function of the
partitive A, from the point of view of the argument put forward by Yli-Vakkuri (1979) that
the quantitative function of the partitive NP with a quantifier is fundamentally different
from that of a bare partitive form. §7 sums up the results of the study. In the following
sections, all examples are from the Internet, unless followed by the symbol (C), which
marks examples coined by the author as a native speaker of Finnish.


2 Nouns and verbs in typical clauses with a partitive A 


In general, the lexical range of nouns that can head the partitive A phrase (which is an 
NP) resembles that available for the partitive S: there are occurrences extending from 
inanimate nouns, as in example (8), to animate ones, as in example (9). However, even 
though the data used here do not permit statistical conclusions, it may be relevant that it 
is quite easy to find instances of the partitive A with an agentive human referent, which 
of course is a feature typical of (nominative) A arguments but less so for partitive S
arguments. In Yli-Vakkuri's (1979) set of 39 examples, animate referents dominate likewise:
there are only 10 examples with an inanimate referent, while in the majority of
Yli-Vakkuri's examples, the referent of the partitive A is human. This suggests a differ- 
ence between the typical partitive S (referring to an inanimate THEME) and the typical 
partitive A (referring to a human AGENT). With respect to animacy, the partitive A thus 
resembles the nominative A which in the spoken-language data of Helasvuo (2001: 92) 
has a human referent in 91.4% of the cases. 

In the Google searches I performed, the verb seurata ‘follow’ turned out to be an especially
fruitful candidate to search for. It is common in Yli-Vakkuri's (1979) data as
well. In my searches, seurata produced numerous hits with a partitive A, with both animate
and inanimate referents. The presence of a quantifying element in the partitive A
phrase also appears to be typical: in my data the partitive-marked noun is often preceded
by a quantifier even if the quantifier was not part of the search string. For instance, the
string ihmis-i-ä seuras-i [person-PL-PAR follow-PST.3SG] produced numerous hits like (11),
where a numeral quantifier (here tuhans-i-a [thousand-PL-PAR] ‘thousands [of]’, likewise
in the partitive) precedes the partitive form ihmisiä. However, it also produced (fewer)
hits where the partitive noun constitutes the subject NP alone (12).


(11) Tuhans-i-a ihmis-i-ä seuras-i tapahtum-i-a ääneti.
thousand-PL-PAR person-PL-PAR follow-PST.3SG event-PL-PAR silently
‘Thousands of people were following / followed the events silently.’


(12) (Esityspaikka oli täynnä, niin että)
ihmis-i-ä seuras-i puhe-tta esityspaika-n ulkopuole-lla-kin.
person-PL-PAR follow-PST.3SG speech-PAR venue-GEN outside-ADE-also
‘(The venue was full, so that) there were people following the speech even outside
the venue.’


The more specific string si-tä seuras-i use-i-ta [it-PAR follow-PST.3SG several-PL-PAR]
‘it was followed by several...’, which specifies both the pre-verbal object NP (the pronoun
‘it’ in the partitive case) and the post-verbal partitive-marked quantifier, produced
inanimate hits only. In those instances, the meaning of seurata ‘follow’ is typically that of
temporal succession, as in (13).


(13) (Ensin kuultiin pienempi räjähdys ja)
si-tä seuras-i use-i-ta voimakka-i-ta räjähdyks-i-ä luol-i-ssa.
it-PAR follow-PST.3SG several-PL-PAR powerful-PL-PAR explosion-PL-PAR cave-PL-INE
‘(First a minor explosion was heard and) it was followed by several powerful
explosions in the caves.’


As I mentioned above, the verb seurata ‘follow’ appears to be particularly common
with the partitive A. This is understandable, because the verb is polysemous and also
has intransitive uses which make it available in existential clauses proper; consider the
existential (14) where the initial elative (‘from’) phrase indicates a reason for the
problems that arise.


(14) Teoria-sta-si seura-a ongelm-i-a.
theory-ELA-2SG.POSS follow-PRS.3SG problem-PL-PAR
‘Problems will follow from your theory.’


Since seurata ‘follow’ appears to be a verb rather productively used with the partitive
A, I used it as a test case to gather statistical information about the phenomenon from
the Finnish Internet Parsebank.⁶ In a dataset of approximately 4 million sentences, the
program found altogether 7875 transitive uses of seurata ‘follow’, of which 13, i.e. 0.17%,
have a partitive A. In all 13 instances the word order is OVS (overall, there are 6121
SVO and 1190 OVS sentences), and the NP with the function of the partitive A includes
quantifying elements in 9 instances. Among the four non-quantified instances, there is
one with a genitive modifier, one with an adjectival modifier, and two that consist of the
partitive-marked noun alone. This demonstrates that the partitive A is indeed quite rare
in actual usage.⁷
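
The proportions behind these figures can be recomputed directly from the counts reported above. The following is a minimal sketch: the variable names are chosen here for illustration only, and the share of partitive A clauses among the OVS sentences is derived from the reported counts rather than stated in the Parsebank results.

```python
# Counts as reported in the text for transitive uses of seurata 'follow'.
transitive_uses = 7875
partitive_a = 13
svo_sentences = 6121
ovs_sentences = 1190

# Breakdown of the 13 partitive A subject NPs.
quantified = 9
non_quantified = {
    "genitive modifier": 1,
    "adjectival modifier": 1,
    "bare partitive noun": 2,
}

# Share of partitive A among all transitive uses, in percent.
share_all = 100 * partitive_a / transitive_uses

# All 13 instances are OVS, so their share among OVS clauses can also be
# computed (a derived figure, not one given in the text).
share_ovs = 100 * partitive_a / ovs_sentences

# Transitive uses that are neither SVO nor OVS.
other_orders = transitive_uses - svo_sentences - ovs_sentences

print(f"partitive A among all transitive uses: {share_all:.2f} %")
print(f"partitive A among OVS clauses: {share_ovs:.2f} %")
print(f"clauses with other word orders: {other_orders}")
```

Even when only the OVS clauses are taken as the baseline, the partitive A remains a marginal option.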

Other transitive verbs that produced hits in my Google searches include, e.g. esittää
‘perform’ (example (9) above), palvella ‘serve’ ((15) below), jättää ‘leave’ (16), hakea ‘fetch;
apply [for]’ (17), viettää ‘spend’ (18), tarkkailla ‘observe’, lukea ‘read’, tuijottaa ‘stare’,
nähdä ‘see’, tanssia ‘dance’, odottaa ‘wait’, tehdä ‘do’, kuunnella ‘listen’, and locative
transitives such as ympäröidä ‘surround’ or reunustaa ‘rim’ (as in The park is rimmed
by beaches), among others. As the choice of the verbs that were searched for was based
on the two small hand-picked sets of examples I had, together with my intuition and
"educated guessing", the list is of course not exhaustive.


(15) Tuhans-i-a kelkko-j-a palvele-e matkailu-a.
thousand-PL-PAR sleigh-PL-PAR serve-PRS.3SG tourism-PAR
‘Thousands of sleighs serve tourism.’ (Newspaper headline, Helsingin Sanomat
9.12.2000, example courtesy of Jaakko Leino)

(16) Sato-j-a-tuhans-i-a ihmis-i-ä jätt-i Suome-n.
hundred-PL-PAR-thousand-PL-PAR person-PL-PAR leave-PST.3SG Finland-ACC
‘Hundreds of thousands of people left Finland.’

(17) Viera-i-ta ihmis-i-ä hak-i tavaro-i-ta piene-stä vaaleanpunaise-sta huonee-sta.
strange-PL-PAR person-PL-PAR fetch-PST.3SG thing-PL-PAR small-ELA pink-ELA room-ELA
‘Strange people were fetching things from the small pink room.’


⁶ My cordial thanks are due to Veronika Laippala, Filip Ginter and Jenna Kanerva for the Parsebank data
(for the Parsebank, see http://bionlp.utu.fi/finnish-internet-parsebank.html).

⁷ A fully automatic search that would be able to recognize partitive A constructions on the Internet must be
left for future research.
(18) Vasta.ranna-n möki-llä ihmis-i-ä viett-i ilta-a ja soittel-i kitara-a.
opposite.shore-GEN cabin-ADE person-PL-PAR spend-PST.3SG night-PAR and play-PST.3SG guitar-PAR
‘At the cabin on the opposite shore there were people spending the night and
playing the guitar.’


It is worth noting that the partitive A seems to occur almost exclusively in the plural. 
It is very difficult to find hits where the partitive A is a singular form of a mass noun, as 
in Vilkuna's (1989: 261) example (19). 


(19) että tietty-j-ä historiallis-i-a muutoks-i-a seura-a välttämättä jo-ta-kin muu-ta
that certain-PL-PAR historical-PL-PAR change-PL-PAR follow-PRS.3SG necessarily something<PAR> else-PAR
‘that certain historical changes are necessarily followed by something else’


In sum, the partitive A seems to favor animate, often human referents, though inanimate
referents can also be found. This is understandable, because in most instances both
A and O are in the partitive (for reasons that will be discussed in §3), and animacy of 
A is then one factor that keeps them apart. The semantic role of the animate referent is 
often agentive, as in (16)-(18), while inanimate referents are more typical in clauses that 
express a transitive locative (e.g. ‘surround’) or a temporal (e.g. ‘follow’) relationship. In 
spite of the agentive role of the animate NPs, the data show a low level of transitivity, 
as will be seen in the following section. 


3 Case marking and the aspectual function of the object NP


As the examples discussed so far show, many verbs that occur with the partitive A are 
agentive, or at least such that they allow an animate subject. It is also noteworthy that 
in most instances not only the subject but also the object NP is in the partitive and not 
in the accusative, which would be the other option and which might be expected for 
morphosyntactic reasons (i.e. to differentiate between A and O by case marking). In
Yli-Vakkuri’s (1979) set of 39 examples, there are 10 instances with an accusative object and
27 with a partitive object; in two examples the construction is elliptical and lacks an
overt object.

In general, the accusative⁸ vs. partitive opposition in Finnish object marking reflects
three features (see e.g. Heinämäki 1984; 1994; Kiparsky 1998; Huumo 2010; 2013): 1) exhaustive
[ACC] vs. non-exhaustive [PAR] quantity, 2) culminating [ACC] vs. non-culminating
[PAR] aspect, and 3) positive [ACC] vs. negative [PAR] polarity. Condition 3) dominates
in the sense that the partitive is used in all object NPs under negation, regardless of
the two other conditions. In affirmative clauses, condition 2) dominates over condition 1)
(as has been argued e.g. by Vilkuna 1996), in the sense that non-culminating aspect triggers
the partitive irrespective of whether the quantity is exhaustive or non-exhaustive.
It is only in instances where the aspect is culminating (e.g. in achievements such as ‘I
found some mushrooms’) that the partitive can indicate non-exhaustive quantity (for a
more detailed hierarchy of the functions, see Huumo 2013). Thus the partitive signals
non-culminating aspect in (20) and non-exhaustive quantity in (21). The accusative object
is used if the clause is affirmative, designates the culmination of the event and has
an object NP that designates a definite quantity; cf. the accusative versions of (20) and
(21). Note that the singular partitive would be ungrammatical in (21), where the verb indicates
a punctual achievement and thus the reading with non-culminating aspect (i.e.
progressive) is excluded; likewise the quantity of the referent (book) cannot be understood
as non-exhaustive, which would trigger the partitive. In (22) the verb is atelic and
the aspect thus non-culminating; therefore only the partitive object is possible.


⁸ In this paper, following the convention of traditional Finnish grammars, I use the term accusative object as
a syntactic cover term for all objects that are not in the partitive case. In morphological terms, the category
of accusative objects comprises a) singular objects (of personal constructions) with the historical accusative
ending -n, b) nominative singular objects in imperative and passive constructions, c) the special accusative
form of personal pronouns with the ending -t, and d) plural nominative objects. This morphologically


(20) Rakens-i-n talo-n ~ talo-a.
build-PST-1SG house-ACC ~ PAR
‘I built [and completed] a/the house’ [ACC] / ‘I was building a/the house ~ built
a/the house a bit ~ did some house-building’ [PAR] (C)

(21) Löys-i-n kirja-n ~ kirja-t ~ kirjo-j-a ~ *kirja-a
find-PST-1SG book-ACC ~ PL.NOM ~ PL-PAR ~ *SG.PAR
‘I found [a/the] book’ [SG.ACC]
‘I found the books.’ [PL.NOM]
‘I found some books’ [PL-PAR] (C)

(22) Ihaile-n Kallu-a ~ *Kallu-n.
admire-PRS-1SG Kallu-PAR ~ *ACC
‘I admire Kallu’ (C)
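
The ranking of the three conditions described above and illustrated in (20)-(22) can be summarized as a small decision procedure. The sketch below is a simplification under stated assumptions: the function name object_case and its three boolean parameters are introduced here for illustration only, "ACC" stands for the syntactic cover category of non-partitive objects, and the finer hierarchy of functions discussed by Huumo (2013) is not modeled.

```python
def object_case(negated: bool, culminating: bool, exhaustive: bool) -> str:
    """Return "PAR" or "ACC" for an object NP (simplified sketch).

    The conditions are checked in order of dominance:
    polarity first, then aspect, then quantity.
    """
    if negated:
        return "PAR"  # negation triggers the partitive in all object NPs
    if not culminating:
        return "PAR"  # non-culminating aspect overrides quantity
    if not exhaustive:
        return "PAR"  # culminating event, non-exhaustive quantity
    return "ACC"      # affirmative, culminating, exhaustive quantity


# (20) 'I built [and completed] the house'
print(object_case(negated=False, culminating=True, exhaustive=True))
# (20) 'I was building the house'
print(object_case(negated=False, culminating=False, exhaustive=True))
# (21) 'I found some books'
print(object_case(negated=False, culminating=True, exhaustive=False))
# (22) 'I admire Kallu'
print(object_case(negated=False, culminating=False, exhaustive=True))
```

Only the first call yields the accusative; the remaining three yield the partitive, each for a different reason.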


Both in the data I collected for this study and in the data analyzed by Yli-Vakkuri 
(1979), partitive objects are more common than accusative objects. Though this may be 
surprising in morphosyntactic terms, it is semantically reasonable when one considers 
the verbs that appear to be typical with the partitive A, i.e. verbs with meanings such 
as ‘serve’, ‘follow’, ‘observe’, ‘stare’, ‘dance’, ‘wait’ and ‘listen’. Most of these are
low-transitivity verbs indicating activities that do not culminate. The partitive marking of
the object then reflects this aspectual feature. Even in instances where the partitive A is 
used with an accomplishment verb (e.g. ‘perform’, ‘read’, ‘do’), the object is usually in the 
plural partitive that signals the non-exhaustive quantity of its referent(s). This means that 


heterogeneous category (as a whole) constitutes the counterpart of the partitive in the object-marking 
alternation based on the oppositions of quantification, aspect, and polarity. 
the overall event consists of iterated accomplishments, the number of which is unknown,
and therefore the aspect may be non-culminating in two ways: either by indicating a 
progressive meaning (‘Some Hungarians were performing dances’ in example (9) above) 
or by indicating a higher-level atelic event (‘Some Hungarians performed some dances’), 
in which case the partitive marking of the object NP (‘dances’) in example (9) means 
that the quantity of the dances performed by ‘some Hungarians’ was non-exhaustive. 
While performing one dance counts as an accomplishment, performing several dances 
in a row is an activity. It is also noteworthy that a singular accusative object tanssi-n 
[dance-Acc] would make example (9) less acceptable with its partitive A. This suggests 
the generalization that the partitive A is most acceptable in low-transitivity clauses that 
are aspectually non-culminating. 

However, there are also instances where the partitive A is used with an accusative 
object, both in my data and in the data of Yli-Vakkuri (1979). An interesting feature
of such sentences is that the event is not collective but distributive: each referent of 
the partitive A (which is in the plural) achieves or accomplishes something individually. 
Consider example (23), which is from the webpage of a newspaper. 


(23) (Se on vanha palkinto, joka annetaan nuorille kirjailijoille,)
ja se-n on saa-nut tosi hieno-j-a kirjailijo-i-ta.
and it-ACC have.PRS.3SG get-PTCP really fine-PL-PAR writer-PL-PAR
‘(It is an old prize given to young writers,) and some really fine writers have won
it.’


In (23), the pronominal object se-n [it-ACC] refers to a prize that each winning author
has won once. Because ‘winning a prize’ is an achievement (i.e. an instance of culminating
aspect), the accusative object is used. It seems that the distributive reading is the
factor that makes the accusative object in (23) acceptable. The importance of the distributive
meaning of the partitive A was also pointed out by Siro (1964: 77), who argued
that a possible motivation for the use of the partitive in transitive subjects may be the
avoidance of a collective interpretation which the nominative case might evoke. All examples
with an accusative object in Yli-Vakkuri’s (1979) data are likewise distributive,
and attempts to form examples with a collective meaning result in ungrammaticality.
Consider the coined example (24) where the (ungrammatical) accusative object would
indicate the culmination of a collective accomplishment. The partitive object is better,⁹
as it indicates non-culminating (in this case, progressive) aspect. Example (25) is an attested
occurrence with a partitive object.


(24) (?)Kirkko-a / *kirkon rakens-i kymmen-i-ä sukupolvi-a.
church-PAR / ACC build-PST.3SG ten-PL-PAR generation-PL-PAR
‘Tens of generations built [participated in the building of] the church.’ (C)


⁹ The question mark indicates the fact that such examples are considered ungrammatical in Standard Finnish
and may not be acceptable for all speakers.
(25) Kymmen-i-ä-tuhans-i-a ihmis-i-ä rakens-i tä-tä linja-a (ja me kävimme siis yhdessä bunkkerissa.)
ten-PL-PAR-thousand-PL-PAR person-PL-PAR build-PST.3SG this-PAR line-PAR
‘There were tens of thousands of people building [participated in the building of]
this [defense] line (and so we visited one bunker).’


In the same vein, example (25) would be odd with the object in the accusative (tämän
linjan), to indicate that the people collectively built and completed the defense line. What
example (25) (like the partitive version of the coined (24)) means is that the quantity of
people referred to by the partitive A took part in the building. In spite of the transitive
structure, an existential kind of meaning is involved (‘There were tens of thousands of
people who participated in the building of the defense line’). Note that in spite of its partitive
object, which often indicates a progressive meaning, example (25) is not progressive
in the sense of indicating a "cross-section" of an ongoing event where a non-exhaustive
quantity of people are simultaneously participating. The participation of the people need
not be simultaneous; the example rather means that there have been people involved in
the building of the defense line at different times during its construction. In this respect,
the partitive A resembles the canonical partitive S (of existentials), the reference of which
may change as the event unfolds (see Huumo 2003 for details). In general, the aspectual
meaning of the examples with a partitive A relates to non-culminating aspect: the events
are atelic processes, or if telic (as the examples with ‘build’), not understood as reaching
their culmination.


4 A note on word order 


As far as word order is concerned, the examples discussed so far show that the constructions
with the partitive A may have an AVO ((15)-(18)) as well as OVA ((8), (10),
(13)) word order. In the absence of systematic corpus data it is impossible to say which order is
more common in actual usage, or whether the word order variants pattern around different
verbs. However, it is easy to see a motivation for both patterns: Finnish is an AVO
language but has a discourse-pragmatically conditioned word order (see Vilkuna 1989: 
35-62) which allows indefinite subjects to occur in a postverbal position, not only in 
existential clauses but also in non-existential constructions, including transitive clauses. 
In actual usage, the postverbal position is typical of indefinite, structurally heavy subject 
NPs that introduce a discourse-new referent (for written language, see Huumo 1995; for 
spoken language, Helasvuo 2001: 75-81). One can thus see two competing motivations 
for the word order in transitive clauses with the partitive A: the AVO order that is typ- 
ical of Finnish transitive clauses, and the XVS order of existentials, combined with the 
tendency for indefinite subjects to occur towards the end of the clause. 

Because object NPs are also commonly in the partitive in the data, ambiguity may be 
expected to arise: which partitive NP is the subject and which one the object? It seems, 
though, that real ambiguity is rare in actual usage, because in many cases the lexical 
meaning of the partitive A shows that it is the subject. For instance, the partitive A (but
not O) is often animate in cases where the verb selects for an animate subject. Further- 
more, if the partitive marking of the object NP unambiguously reflects non-culminating 
aspect, not quantity, then that NP cannot be understood as the partitive A, which follows 
the rules of existential S marking in that the partitive indicates non-exhaustive quantity, 
not aspect. In spite of these facts, there are some ambiguous instances in my data. In (26) 
both A and O are plural partitive NPs with a human referent, and thus the example as 
such is ambiguous between the AVO and the OVA readings. 


(26) Sotila-i-ta seuras-i aina huolto.joukko-j-a ja kauppia-i-ta huolto.varmuude-n ylläpitämise-ksi.
soldier-PL-PAR follow-PST.3SG always maintenance.troop-PL-PAR and vendor-PL-PAR maintenance.certainty-GEN securing-TRA
‘Soldiers were always followed by maintenance troops and vendors to secure the
maintenance.’


Even in this case, however, the context reveals that it is the maintenance troops and 
vendors who follow the soldiers (into conquered territories), not vice versa. The exam- 
ple is thus OVA. In purely grammatical terms, though, nothing would prevent the AVO 
reading, and in the coined, context-less example (27) the AVO and OVA interpretations 
are equal. 


(27) Tyttö-j-ä seuras-i poik-i-a.
girl-PL-PAR follow-PST.3SG boy-PL-PAR
‘[Some] girls followed [some/the] boys’ / ‘[Some/the] girls were followed by
[some] boys.’


As the English translation of (27) shows, the partitive A is always indefinite but the partitive
O may be either definite or indefinite. If O is understood as definite (‘the girls’, ‘the
boys’), then its partitive marking reflects the non-culminating aspect only. This is also
the reason why example (28) below can only be an AVO instance where häntä ‘him/her’
is the grammatical object and not a partitive A: its partitive case is not motivated by a
non-exhaustive quantity but by non-culminating aspect.


(28) Kymmen-i-ä, ell-ei sato-j-a sotila-i-ta seuras-i hän-tä.
ten-PL-PAR if-NEG hundred-PL-PAR soldier-PL-PAR follow-PST.3SG 3SG-PAR
‘Tens if not hundreds of soldiers followed him/her.’


Another grammatical feature that relates to word order is the lack of subject-verb 
plural agreement in colloquial spoken Finnish (see e.g. Helasvuo 2001: 67) but also in 
nonstandard written varieties, such as Internet texts. In such varieties, the singular 3rd 
person verb form is used even with plural nominative subjects. In Standard Finnish, this 
is considered an error; however, there is clearly a pressure from the colloquial varieties
against plural agreement, and this pressure seems to be strongest in clauses where an indefinite
plural nominative subject follows the verb. According to my observations, even
university students of Finnish (who are educated to be specialists in the language) have
difficulties in marking plural agreement if the nominative plural subject is indefinite and
follows the verb. Keeping in mind that the partitive S does not trigger verb agreement,
it is possible (as also suggested by De Smit 2016) that the decay of agreement, which is
clearly manifest in spoken and nonstandard written Finnish, is another feature paving
the road for the partitive marking to spread into indefinite plural subjects even in transitive
clauses. When there is no agreement even with a (post-verbal) nominative subject,
then constructions with a nominative vs. a partitive subject resemble each other in all
respects except the case marking of the subject; in other words, there is no agreement
to prevent the use of the partitive.


5 Semi-transitive infinitival constructions 


If looked at in isolation, transitive clauses with a partitive A may appear striking, but 
there are in fact a few infinitival constructions, also acceptable in Standard Finnish, that 
bring the partitive S and an object NP close to being arguments of the same complex 
predicate. In the (coined) example (29), the predicate verb is intransitive and has a parti- 
tive S but also an infinitival modifier, traditionally parsed as an adverbial, consisting of 
a transitive verb which has its own object NP. 


(29) Turiste-j-a saapu-u ihastele-ma-an rakennus-ta.
tourist-PL-PAR arrive-PRS.3SG admire-INF-ILL building-PAR
‘Tourists arrive to admire the building.’ (C)


Example (29) has an intransitive motion verb (‘arrive’) which is quite typical in exis- 
tential clauses. Therefore the partitive S is grammatical. The example also includes an 
infinitival form of the transitive verb ‘admire’, which in turn has a grammatical object
but no subject argument of its own - the infinitive is controlled by the matrix verb in 
the sense that the A argument of the matrix verb is understood as the agent of the infini- 
tive as well. In traditional grammars of Finnish, such infinitival forms are analyzed as 
adverbials of the finite verbs, and since the object is part of the infinitival construction, 
it is not considered to be an object at the level of the matrix clause. If the relationship 
between the finite verb and the infinitive is relatively tight (i.e. if they are understood 
as forming a complex predicate where the function of the matrix verb resembles that of 
an auxiliary), then “almost-transitive” clauses arise where the partitive A and the O can 
be understood as arguments of the same complex predicate (not of different verb forms); 
consider (30). 

(30) Mon-i-a  lahjakka-i-ta  ihmis-i-ä  on  teke-mä-ssä
many-PL-PAR  talented-PL-PAR  person-PL-PAR  be.PRS.3SG  do-INF-INE
ulkopolitiikka-a.
foreign.policy-PAR
(mutta sitä tehdään omissa lokeroissa eivätkä eri osa-alueet kohtaa.)

‘There are many talented people carrying out [our] foreign policy (but they do it
in their individual lockers and the different areas do not meet).’


In (30), the finite verb is olla ‘be’, which, on the one hand, is the most typical existential 
verb, but, on the other hand, also functions as an auxiliary when it is combined with
infinitival forms to form complex predicate constructions. The infinitival form in (30) is 
teke-mä-ssä, the so-called 3rd infinitive inessive form of the verb meaning ‘do’ (roughly 
translatable as ‘in doing’). This infinitive often combines with the verb ‘be’ to form a
progressive construction; cf. (31). 


(31) Ole-n  luke-ma-ssa  tä-tä  raportti-a-si.
be-PRS.1SG  read-INF-INE  this-PAR  report-PAR-2SG.POSS


‘I am reading this report of yours.’ (C)


Though the Finnish olla (‘be, exist’) + the 3rd infinitive inessive (‘in-the-activity-of’)
construction is not a fully grammaticalized progressive but maintains a locative-absen- 
tive meaning (by implying that the agent is absent from the location of the speech event, 
at another location where the activity takes place; cf. Markkanen 1979; Tommola 2000; 
Onikki-Rantajääskö 2005), it is nevertheless a more grammaticalized combination of an
existential finite verb and its transitive infinitival “adverbial” modifier than the construc- 
tions in example (29). In constructions like (30), the partitive S and the O are close to 
being arguments of the same predicate. Yli-Vakkuri (1979: 165) also points out that in 
her data of the partitive A, many instances could alternatively be expressed by using the 
progressive construction, as they indicate an ongoing event. 

Note that the analogy of expressions such as (30) may also for its part explain why the 
partitive object is more natural than the accusative in transitive clauses with a partitive 
A. The partitive O can reflect different types of non-culminating aspect, among which the 
progressive meaning is a typical one. Thus if progressive constructions such as (30) give 
analogical support to the partitive A, then it is reasonable that the progressive meaning 
is also typical in transitive clauses with the partitive A. However, at a more general level 
it can be pointed out that both the partitive S and the partitive O associate with low 
transitivity (in aspectual terms, atelic, progressive or cessative aspect as opposed to
telic predicates such as accomplishments, cf. Huumo 2010). This may also motivate the 
dominance of partitive objects in clauses with the partitive A. 


However, as pointed out to me by an anonymous reviewer, it seems to be the case that not all low- 
transitivity constructions accept the partitive A. 


6 The role of quantifiers


In this section, I will take a closer look at the quantifier expressions that are typical in 
NPs with the function of the partitive A. Subsection §6.1 introduces and discusses different
types of mass (‘a lot of’, ‘much’) and plurality (‘several’, ‘a few’) quantifiers that
are common in this function, while subsection §6.2 concentrates on the singular quantifier
moni ‘many’ (+singular), and its partitive form mon-ta, which has been reanalyzed
as a nominative in many contexts and, as a consequence, given rise to the pleonastic 
double partitive mon-ta-a that explicitly indicates the function of a partitive. The form 
monta alternates between the functions of a nominative and a partitive and is typical in 
(partitive) A phrases as well. 


6.1 Quantifiers in the partitive A phrase 


A characteristic feature of phrases with the function of the partitive A is the presence 
of quantifying elements such as ‘several’, ‘a lot of’, as well as indefinite numerals that
are themselves in the partitive case (‘hundreds / thousands of’). These quantifiers can
be roughly divided into two groups depending on whether they are able to quantify
both mass nouns and plurals (as the English a lot of coffee ~ a lot of cars) or plurals only
(*several coffee ~ several cars). I will refer to these two groups as mass quantifiers and
plurality quantifiers, respectively (detailed analyses [in Finnish] include Hakulinen & 
Karlsson 1979; Huumo 2016a,b). Finnish plurality quantifiers, like adjectival modifiers 
in general, agree with their head (the quantified noun) in number and case (32), while 
mass quantifiers are fossilized forms not inflected in number and case (33). Both kinds 
of quantifiers are used in NPs with the function of a partitive subject (S or A). 


(32) Use-i-ta  auto-j-a  seiso-o  piha-lla.
several-PL-PAR  car-PL-PAR  stand-PRS.3SG  yard-ADE


‘There are several cars standing in the yard.’ (C)


(33) Paljon  auto-j-a  seiso-o  piha-lla.
a.lot.of  car-PL-PAR  stand-PRS.3SG  yard-ADE


‘There are a lot of cars standing in the yard.’ (C)


The tendency for partitive A phrases to include quantifiers was also observed by Yli- 
Vakkuri (1979). In her data of 39 examples collected from actual usage, 33 examples have a 
quantifying element preceding the partitive noun. Yli-Vakkuri also made a query to 103 
native-speaker informants regarding the acceptability of different subtypes of clauses 
with a partitive A. She found that a clear majority of the informants considered
versions with a quantifier more acceptable than those with a bare (unquantified) parti- 
tive noun form. She also asked the informants to correct the sentences they considered 
ungrammatical. The result was, remarkably, that many informants added a quantifier 
but maintained the partitive marking of the quantified NP instead of changing it into 
the nominative (Yli-Vakkuri 1979: 175). This raises the question about the central role of 
the quantifier in the partitive A phrases. 
In my data gathered with Google, quantifying elements are also common, even if they
were not searched for. For example, in the hits produced by the search string “ihmisiä
seurasi” (‘people[PAR] followed’; see the examples in §2), most hits where ihmisiä was
a part of a partitive A phrase had some kind of a quantifying element preceding the
form ihmisiä. The search also produced hits (not specifically targeted) where the partitive form
ihmisiä is a post-modifier of a nominative head with a collective meaning, such as ‘group’
or ‘team’, i.e. a collective that consists of a number of persons, as in (34) and (35) (which 
of course are not instances of the partitive A). 


(34) Täysi  torillinen  ihmis-i-ä  seuras-i  Valoviikko-jen
full  market.place.full  person-PL-PAR  follow-PST.3SG  light.week-PL.GEN
avajais-i-a  Tamperee-lla.
opening-PL-PAR  Tampere-ADE
‘A full market-place-full of people was following the openings of the
Illuminations in Tampere.’


(35) Suuri  joukko  ihmis-i-ä  seuras-i  Schwarzeneggeri-n  ja
big  crowd  person-PL-PAR  follow-PST.3SG  Schwarzenegger-GEN  and
olympiatule-n  yhteis-tä  matka-a.
olympic.fire-GEN  joint-PAR  journey-PAR
‘A big crowd of people followed the journey of Schwarzenegger and the Olympic
Flame.’


In (34) the head of the subject NP is the nominative form torillinen ‘market-place-full’, 
derived from the noun tori ‘market place’ to designate something that fills the whole
market place. The partitive ihmisiä is a post-modifier of this noun. In example (35) the 
head noun of the subject NP, joukko ‘crowd’, is in the nominative, and it is followed by the 
partitive modifier ihmisiä ‘people’. These examples are thus not instances of the partitive
A but illustrate a "legitimate" construction (from the point of view of language planning 
authorities) where the subject NP that contains a partitive form has the function of A. 
In the light of these examples, now consider (36)-(38). 


(36) Runsaa-sti  ihmis-i-ä  seuras-i  vappupuhe-i-ta
abundant-ADV  person-PL-PAR  follow-PST.3SG  1st.of.May.speech-PL-PAR
aurinkoise-lla  mutta  tuulise-lla  kauppatori-lla.
sunny-ADE  but  windy-ADE  market.square-ADE
‘A lot of [lit. abundantly] people were following the 1st of May speeches on the
sunny but windy market square.’

(37) (Elvis Presleyn kuolema vuonna 1977 toi välittömästi yli 100 000 surijaa
Gracelandin porteille, ja)
sama-n  verra-n  ihmis-i-ä  seuras-i  paika-n  pää-llä
same-ACC  amount-ACC  person-PL-PAR  follow-PST.3SG  spot-GEN  on-ADE
häne-n  hautajais-i-a-an.
3SG-GEN  funeral-PL-PAR-3POSS
‘(Elvis Presley's death in 1977 immediately brought over 100 000 mourners to the
gates of Graceland, and) the same amount of people followed his funeral on the
spot.’


(38) (Missä hän menikin, niin) 
paljon ihmis-i-ä seuras-i hän-tä. 
a.lot.of  person-PL-PAR  follow-PST.3SG  3SG-PAR
‘(Wherever He [Christ] went), a lot of people followed Him.’


In (36)-(38), the partitive form ihmisiä is preceded by a mass quantifier which is more 
abstract than the collective nouns of examples (34)-(35). It is not always clear whether 
the head of the subject phrase is the quantifier or the partitive. For example, the influen- 
tial Finnish syntax book by Hakulinen & Karlsson (1979: 147) mentions both possibilities 
for the analysis of such phrases, as either NPs or “quantifier phrases”. However, unlike 
the collective nouns in (34)-(35), the quantifiers in (36)-(38) are not referential: they do
not designate a group or other kind of a collective that would be understood as the ac- 
tual referent of the phrase. For instance, in (36) the adverb runsaasti 'abundantly' used 
as a mass quantifier does not refer to a group but specifies the quantity indicated by 
the partitive ihmisiä ‘people’. This means that, in semantic terms at least, there are good
reasons to consider the partitive-marked noun the head of the phrase. 

Morphologically, runsaasti is derived from the adjective runsas ‘abundant’ by adding
the adverb-forming affix -sti, in the same way as the English abundant-ly, which is
semantically close to it. The quantifier paljon (38), in turn, is historically the accusative form
of the quantifier paljo ‘multitude’ (cf. Tuomikoski 1978), which has grammaticalized into
an opaque quantifier and is only used in its accusative form in present-day Finnish (see
Karttunen 1975 for the grammar of paljon). Though Karttunen (1975), following Penttilä
(1963), considers paljon the head of phrases such as that in (38), this element resembles
runsaasti of example (36) in being a quantifier, not a noun, and there are equally
good reasons to argue that the partitive form is actually the head and the phrase is an
NP. The more recent comprehensive grammar (Hakulinen et al. 2004: §657) states that
such quantifiers occur “next to the NP” they quantify, hinting that the quantifiers might
be external to the NP. The expression verran in (37) apparently has a similar background
to paljon: it is a grammaticalized accusative form of the noun verta meaning ‘worth’ or
‘match’ (as in He is no match to me). In any case, it is not referential in (37).

In sum, all subject phrases in (36)-(38) include mass quantifiers that are not inflected
and, for instance, cannot be pluralized. The collective heads proper in (34) and (35),
by contrast, can be pluralized, as in jouko-t ihmis-i-ä ‘groups of people’ (which in subject position triggers
plural agreement in the verb in Standard Finnish). The collective nouns can also be case
inflected, as in torillise-lle [ALLATIVE] ihmis-i-ä ‘to a/the market-square-full of people’,
where, irrespective of the case marking of the collective noun, the partitive postmodifier
keeps its partitive in all contexts; this is another feature demonstrating that the collective
noun is indeed the head. The quantifying expressions in (36)-(38), in contrast, are
not inflected and show no behavior of a head of a subject NP (i.e. do not trigger verb
agreement).

In terms of prescriptive grammar, transitive clauses such as (34)-(35) are acceptable, 
because the collective noun is the head of the subject NP and it is in the nominative. 
In contrast, examples (36)-(38) have been considered ungrammatical by some language 
planning authorities, because they bring the partitive subject into a transitive clause 
(in an analysis where the partitive is the head). However, it is easy to see a similarity 
between the two constructions, and it is very likely that expressions such as (34) and 
(35) serve as an analogy for the use of the partitive A with a quantifier as in (36)-(38). 
Note, furthermore, that verb agreement does not help to distinguish the head in examples 
like (36)-(38) in the way of the English alternation between A flock of geese is ~ are in 
the yard, where the verb form shows whether flock or geese is understood as the head 
of the subject NP (see Langacker 2009: 53). This is because the quantifiers in examples 
(36)-(38) cannot be morphologically pluralized (to trigger plural agreement in the verbs; 
note that they do not trigger semantic plural agreement either). 

On the other hand, plurality quantifiers agree with the quantified noun in number and 
case; see (39) and (40) below. 


(39) (Sitten huomasin, että) 
minu-a  tuijott-i  use-i-ta  silmä.pare-j-a  varjo-i-sta.
1SG-PAR  stare-PST.3SG  several-PL-PAR  eye.pair-PL-PAR  shadow-PL-ELA


‘(Then I noticed that) I was stared at by several pairs of eyes from the shadows.’


(40) (vaikka näkisikin että)
sato-j-a  ihmis-i-ä  on  luke-nut
hundred-PL-PAR  person-PL-PAR  have.PRS.3SG  read-PTCP
viesti-si
message-ACC.2SG.POSS
(niin harva kuitenkaan vaivautuu vastaamaan)


‘(Even though you see that) hundreds of people have read your message, (only
few bother to answer you).’


Like examples (36)-(38), examples (39) and (40) include a quantifying element that pre- 
cedes the partitive noun. The difference is that in (39) and (40) the quantifying element 
is a plurality quantifier and therefore agrees with the partitive-marked noun. Such NPs 
thus seem to be partitive subjects indisputably. However, Yli-Vakkuri (1979) argues that 
in spite of the partitive of the quantifier, such phrases differ from unquantified partitive 
subjects which indicate a non-exhaustive quantity. The quantity indicated by phrases
such as those in (39) and (40) is, in Yli-Vakkuri's terms, quantitatively marked. This can
be seen best by analyzing uses where such phrases have the function of a grammatical 
object; recall that the partitive marking of the object NP may reflect non-culminating as- 
pect or non-exhaustive quantity in affirmative clauses. Yli-Vakkuri (1979) demonstrates 
that the quantity expressed by phrases including a partitive quantifier (such as the sub- 
ject NPs in (39) and (40)) behaves like (in the current terminology) an exhaustive quantity 
in certain contexts. For instance, if the phrase use-i-ta ihmis-i-ä [several-PL-PAR person-PL-PAR]
has the function of a grammatical object, it behaves, in terms of quantification,
like a plural accusative object (which is morphologically in the nominative case and in- 
dicates an exhaustive quantification), not like an unquantified partitive NP. This can be 
seen by considering the behavior of the durative modifiers tunni-n [hour-ACC] ‘for an
hour’ vs. tunni-ssa [hour-INE] ‘in an hour’, which, like their English counterparts, are a 
good test indicator for non-culminating vs. culminating aspect, respectively. Consider 
the following examples. 


(41) Poim-i-n  sien-i-ä  tunni-n  (*tunni-ssa).
pick-PST-1SG  mushroom-PL-PAR  hour-ACC  (*INE)


‘I picked mushrooms for (*in) an hour.’ (C)


(42) Poim-i-n  siene-t  tunni-ssa  (*tunni-n).
pick-PST-1SG  mushroom-PL.NOM  hour-INE  (*ACC)


‘I picked the mushrooms in (*for) an hour.’ (C)


(43) Poim-i-n  use-i-ta  sien-i-ä  tunni-ssa  (*tunni-n).
pick-PST-1SG  several-PL-PAR  mushroom-PL-PAR  hour-INE  (*ACC)


‘I picked several mushrooms in (*for) an hour.’ (C)


These examples all designate an iterative event of picking mushrooms, with the du- 
ration of an hour. Because the unquantified partitive object in (41) designates a non- 
exhaustive quantity of mushrooms, the number of the sub-events (of picking one mush- 
room at a time) is likewise non-exhaustive (unbounded), and the accusative-marked du- 
rative adverbial tunnin ‘for an hour’ must be used to indicate the temporal boundaries of 
the event. In (42) the plural accusative (syntactically accusative, morphologically nomi- 
native) object indicates an exhaustive quantity of mushrooms, which yields a bounded 
number of the sub-events; hence the inessive tunnissa ‘in an hour’ must be used. Remark- 
ably, even though both the quantifier useita and the head sieniä ‘mushrooms’ in (43) are 
in the partitive, the example aligns with the accusative object in (42), not with the bare 
partitive in (41), by selecting the inessive durative element. As Yli-Vakkuri (1979) points


However, it deserves to be pointed out that if the partitive marking of the object NP is triggered by
non-culminating aspect alone, not by non-culminating aspect based on non-exhaustive quantity, then the
phrases with a partitive quantifier align with partitive objects: Heikki rakast-i [nais-ta / *naise-n / nais-i-a
/ *naise-t / use-i-ta nais-i-a] [Heikki love-PST.3SG woman-PAR / *woman-ACC / woman-PL-PAR / *woman-PL.NOM
/ several-PL-PAR woman-PL-PAR] ‘Heikki loved [a/the] woman / [∅/the] women / several women’.
Because the verb ‘love’ is atelic, the accusative object is ungrammatical, but both the unquantified partitive
(singular or plural) and the plural partitive quantified by useita are fine. A more detailed analysis of the
grammatical functions of phrases with plural partitive quantifiers must be left for future research.
out, the (syntactic) accusative object with the plural NP usea-t siene-t [several-PL.NOM
mushroom-PL.NOM] would indicate a more specialized meaning, i.e. ‘several sets of mushrooms’,
e.g. for different mushroom dishes. Therefore, she argues, the case distribution
(NOM/ACC vs. PAR) of quantified NPs differs from that of unquantified NPs. 

This is strong evidence for Yli-Vakkuri's (1979) point that the quantity indicated by 
an NP with a plurality quantifier is fundamentally different from the quantity indicated 
by an unquantified NP. The same can be said of examples such as (40), with a plural 
partitive of a numeral, which can only be formed of numerals divisible by ten (‘tens
of’, ‘hundreds of’, ‘thousands of’, but not for instance ‘eights of’). In Finnish, such expressions,
when used in the function of a subject, alternate between the nominative
(e.g. kymmene-t ihmise-t [ten-PL.NOM person-PL.NOM]) and the partitive (e.g. kymmen-i-ä
ihmis-i-ä [ten-PL-PAR person-PL-PAR]), both of which can be translated into English
as tens of people. The nominative version can mean either ‘ten sets of people’ [e.g. ten
work teams] or, more vaguely, ‘several sets of (ten) people’, in which case the opposition
between the partitive and the nominative is neutralized, as both expressions are vague 
as to how many such sets they refer to. 

When such a phrase is used as the subject of a transitive clause in Standard Finnish, 
it would be expected to be in the nominative. However, as the data of Yli-Vakkuri (1979) 
and this study suggest, in unedited texts at least, the plural partitive numeral is quite 
common and acceptable. According to Yli-Vakkuri (1979), one motivation for the expansion
of the partitive in this construction is the fact that the nominative might imply
too specific an interpretation for the quantified partitive noun (e.g. ‘tens of the people’, or
(specifically) ‘ten sets of people’), which is not intended. Thus the partitive quantifier
may be gaining ground in uses where the nominative would indicate too specific mean- 
ings. As in example (43), the partitive plural numeral also indicates a definite quantity 
when used as the object in iterative expressions; consider (44). 


(44) Poim-i-n  kymmen-i-ä  sien-i-ä  tunni-ssa  (*tunni-n).
pick-PST-1SG  ten-PL-PAR  mushroom-PL-PAR  hour-INE  (*ACC)


‘I picked tens of mushrooms in (*for) an hour.’ (C)


In semantic terms, the grammatical behavior of the phrases with partitive-marked 
quantifiers thus suggests that they designate a definite quantity. Like uninflected, fossilized
mass quantifiers such as paljon ‘a lot of’ (38) or runsaasti ‘abundantly’ (36), plural
partitive quantifiers suffice to quantify the partitive-marked noun. For instance, in (44)
this means that there is an indefinite number of higher-order quantities that consist
of ten mushrooms each. This, perhaps surprisingly, yields a bounded quantity of the
mushrooms, even though the plural partitive kymmeniä ‘tens (of)’ would suggest that
the number of such quantities (with ten mushrooms in each) is unbounded. One might 
in fact say the same of the English translation of (44): the expression tens of mushrooms 
literally indicates an indefinite number of quantities of ten mushrooms. Likewise in En- 
glish, though, the durative modifier must be of the type in an hour, not for an hour. In 
sum, there are good reasons to concur with Yli-Vakkuri's (1979) argument that the over- 
whelmingly most common kind of phrase used as a partitive A, that is, an NP with a 
quantifying element preceding the partitive-marked noun, is fundamentally different
from a bare partitive form in terms of quantification. 
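
The outcome of the adverbial test in (41)-(44) can be summarized as a small lookup table. The Python sketch below only restates the pattern described above; the category labels and the function name are assumptions introduced for illustration and are not terminology from Yli-Vakkuri (1979).

```python
# Illustrative summary of the durative-adverbial test in examples (41)-(44).
# The labels and the function name are hypothetical, chosen for this sketch.

# Each object type is mapped to the adverbial it selects in the examples:
# tunnin 'for an hour' (non-culminating) vs. tunnissa 'in an hour' (culminating).
ADVERBIAL_BY_OBJECT_TYPE = {
    "bare plural partitive": "tunnin",                    # (41) sieniä
    "plural accusative": "tunnissa",                      # (42) sienet
    "plurality quantifier + partitive": "tunnissa",       # (43) useita sieniä
    "plural partitive numeral + partitive": "tunnissa",   # (44) kymmeniä sieniä
}


def patterns_with_accusative(object_type: str) -> bool:
    """Return True if the object type selects the same adverbial as (42)."""
    return (ADVERBIAL_BY_OBJECT_TYPE[object_type]
            == ADVERBIAL_BY_OBJECT_TYPE["plural accusative"])


for object_type, adverbial in ADVERBIAL_BY_OBJECT_TYPE.items():
    print(f"{object_type}: {adverbial}")
```

In this encoding, only the bare plural partitive of (41) fails to pattern with the accusative object, which is the asymmetry the argument relies on.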

It is also worth pointing out that if such a quantifier is added to one of the partitive NPs 
in the ambiguous example (29) (‘girls[PAR] followed boys[PAR]’), then a strong inclination
arises to understand the quantified phrase as the subject, even though in principle it 
could still be the object as well. Consider the following examples. 


(45) Tyttö-j-ä  seuras-i  use-i-ta  poik-i-a.
girl-PL-PAR  follow-PST.3SG  several-PL-PAR  boy-PL-PAR
‘The girls were followed by several boys.’ / ?? ‘[Some] girls followed several boys.’ (C)


(46) Kymmen-i-ä  tyttö-j-ä  seuras-i  poik-i-a.
ten-PL-PAR  girl-PL-PAR  follow-PST.3SG  boy-PL-PAR


‘Tens of girls followed the boys.’ / ?? ‘Tens of girls were followed by boys.’ (C)


Furthermore, the quantifier paljon ‘a lot’ in fact makes this test sentence unambiguous, 
because it cannot quantify the object of an atelic verb (see Karttunen 1975), and thus the 
paljon phrase must be the subject in (47). 


(47) Tyttö-j-ä  seuras-i  paljon  poik-i-a.
girl-PL-PAR  follow-PST.3SG  a.lot.of  boy-PL-PAR


‘The girls were followed by a lot of boys.’ (C)

Such effects disappear and the ambiguity returns if both phrases include a quantifier:


(48) Kymmen-i-ä  tyttö-j-ä  seuras-i  use-i-ta  poik-i-a.
ten-PL-PAR  girl-PL-PAR  follow-PST.3SG  several-PL-PAR  boy-PL-PAR
‘Tens of girls were followed by several boys.’ / ‘Tens of girls followed several boys.’ (C)


However, such combinations seem to be extremely rare in actual usage. In Yli-Vakkuri's
data, there is not a single instance of the type illustrated by (48), and I have not been able
to find such hits with my searches either. As most partitive A phrases include quanti- 
fiers, and most object phrases do not, this suggests that the system nevertheless rather 
successfully keeps the A and O grammatically apart in the majority of cases. 


6.2 The problematic monta ‘many[PAR?]’ 


Among the quantifying expressions commonly used in partitive A phrases, the form mon-ta
[many-PAR] ‘many’ has an especially interesting role (see also Huumo 2017). First of
all, it is (historically) a singular partitive form of the quantifier moni ‘many’, and the
element it quantifies is likewise in the singular partitive, not in the plural like most
partitive A phrases. The nominative form moni modifies a singular nominative head,
but it has a more specific (‘many of the’) type of meaning, e.g. moni mies [many.SG.NOM
man.SG.NOM], cf. the English many a man. For this quantifier, the form mon-ta, in spite of
its partitive case, has been generalized to many uses where it has a function similar to the 
nominative form of cardinal numerals. In spite of this, the earlier literature on partitive 
A (until Branch 2001) has treated monta expressions as partitive phrases, without paying 
attention to their special nature. 

To grasp the idiosyncratic nature of monta phrases, consider first the use of cardinal 
numerals in Finnish. Finnish cardinal numerals in the nominative combine with a singular
partitive noun that indicates the quantified entity type, e.g. viisi mies-tä [five.NOM
man-SG.PAR] ‘five men’. In other case forms, however, the quantified noun and the numeral
carry the same case. The numeral can also occur in the partitive if used for instance
in the function of a partitive object;¹² consider example (49).


(49) Heikki  rakasta-a  kolme-a  nais-ta.
Heikki  love-PRS.3SG  three-PAR  woman-SG.PAR


‘Heikki loves three women.’ (C)


If the numeral is in the nominative, it is analyzed as the head by grammars, and the 
quantified partitive form as a post-modifier (50). However, in other case forms the nu- 
meral agrees with the quantified noun (like an adjectival modifier), which is why the 
quantified noun is then considered the head; cf. example (51) where the possessor NP is 
marked with the adessive. 


(50) Viisi  mies-tä  saapu-i.
five.NOM  man-SG.PAR  arrive-PST.3SG
‘Five men arrived.’ (C)

(51) Viide-llä  miehe-llä  on  flunssa.
five-ADE  man-SG.ADE  is.PRS.3SG  flu.NOM
‘Five men have the flu.’ (C)
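
The case-marking rule for cardinal numeral phrases stated above can be sketched as a small function. This is a minimal illustration under simplifying assumptions: the function name and the tiny form tables are hypothetical, and only the word forms cited in (49)-(51) are included.

```python
# Minimal sketch of case marking in Finnish cardinal numeral phrases,
# limited to the forms cited in examples (49)-(51).
# The tables and the function name are assumptions made for illustration.

NUMERAL_FORMS = {
    "five": {"NOM": "viisi", "ADE": "viidellä"},
    "three": {"PAR": "kolmea"},
}
NOUN_FORMS = {
    "man": {"PAR": "miestä", "ADE": "miehellä"},
    "woman": {"PAR": "naista"},
}


def numeral_phrase(numeral: str, noun: str, case: str) -> str:
    """Build a numeral phrase for the given case of the whole phrase.

    A nominative numeral takes a singular partitive noun; in the other
    cases the numeral and the noun carry the same case.
    """
    noun_case = "PAR" if case == "NOM" else case
    return f"{NUMERAL_FORMS[numeral][case]} {NOUN_FORMS[noun][noun_case]}"


print(numeral_phrase("five", "man", "NOM"))     # as in (50)
print(numeral_phrase("five", "man", "ADE"))     # as in (51)
print(numeral_phrase("three", "woman", "PAR"))  # as in (49)
```

The sketch makes the asymmetry explicit: only in the nominative do the numeral and the noun differ in case, which is the slot that monta has come to occupy.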


As I pointed out above, the form monta, though morphologically a partitive, behaves 
in many contexts like the nominative (not partitive) form of a numeral (Branch 2001); 
consider (52) and (53). 


(52) Mon-ta  mies-tä  saapu-i.
many-PAR  man-SG.PAR  arrive-PST.3SG
‘Many men arrived.’ (C)

(53) Viisi  (*viit-tä)  mies-tä  saapu-i.
five.NOM  (*PAR)  man-SG.PAR  arrive-PST.3SG


‘Five men arrived.’ (C)


12 To indicate non-culminating aspect or negative polarity. Note that the quantity indicated by the numeral
phrase is of the exhaustive type, which is why the partitive marking cannot be motivated by non-exhaustive
quantity.

It is in examples like (52) that the partitive form monta behaves like the nominative
form of a numeral (53). In principle, the nominative moni mies [many.NOM man.NOM] 
would be expected, but as Yli-Vakkuri (1979) and Branch (2001) point out, it would easily 
be understood as meaning ‘many of the men’ [i.e. some members of a definite set] or the
idiomatic ‘many a man’. Note that the subject NP in (52) is not functionally similar to 
a partitive subject proper, as singular count nouns cannot be used in this function (see 
examples (1)-(3)). Example (53) shows that numerals must take the nominative in such 
a context. 

Since mon-ta, in spite of its partitive ending, is functionally similar to the nominative 
of numerals, the pleonastic “double partitive” form mon-ta-a [many-PAR-PAR] has arisen 
to explicitly indicate the partitive meaning. Like the partitive of the numeral ‘five’ in (53), 
the form montaa would be ungrammatical in (52). Montaa is used in contexts where nu- 
merals are likewise in the partitive, e.g., in the functions of aspectually partitive-marked 
or negative-polarity partitive objects. It is in a grammatical opposition with the “nominativized”
monta in contexts where aspect can alternatively be understood as culminating
or not culminating; consider (54) (with a nominative numeral or monta) vs. (55) (with a 
partitive numeral or montaa). 


(54) Ole-n  luke-nut  mon-ta ~ kaksi  kirja-a.
have-PRS.1SG  read-PTCP  many-PAR ~ two.NOM  book-SG.PAR


‘I have read many ~ two books [completely].’ (C)


(55) Ole-n  luke-nut  mon-ta-a ~ kah-ta  kirja-a.
have-PRS.1SG  read-PTCP  many-PAR-PAR ~ two-PAR  book-SG.PAR


‘I have read many ~ two books [not completely]’; ‘I have been reading many ~
two books.’ (C)


In example (54), the form monta, like the nominative numeral kaksi ‘two’, indicates 
a culminating aspect: the books have been read completely. Functionally they thus resemble
the accusative object. In (55), on the other hand, the form montaa, as well as
the partitive kahta, indicates that the reading is either ongoing or that it has not (yet)
covered the books in their entirety.

Until the mid-1990s, the pleonastic montaa was considered an error by language planning
authorities, but in 1995 it was accepted in contexts such as (55), where the partitivity
needs to be explicitly indicated (Länsimäki 1995; Nyman 2000; Branch 2001). However, if
the aspect is unambiguously of the non-culminating type, then even monta can still have 
the function similar to that of a partitive numeral (56); cf. (57) with a numeral proper. 


12 In a Google search (13.11.2014), the string rakastaa mon-ta-a [love.PRS.3SG many-PAR-PAR] produced over
5000 hits, while rakastaa mon-ta [many-PAR] produced slightly more than 1000 hits. Though such numbers
must be taken with great caution, this might suggest that in Internet language, the double partitive is more
common (as expected), but both forms are nevertheless used in the function of the partitive object of the
atelic verb rakastaa ‘love’ (which does not take an accusative object outside some resultative constructions
such as “She loved him crazy”).

(56) Eemeli  rakasta-a  mon-ta(-a)  nais-ta.
Eemeli  love-PRS.3SG  many-PAR(-PAR)  woman-SG.PAR


‘Eemeli loves many women.’


(57) Eemeli  rakasta-a  kah-ta  (*kaksi)  nais-ta.
Eemeli  love-PRS.3SG  two-PAR  (*NOM)  woman-SG.PAR


‘Eemeli loves two women.’


In (56), both monta and montaa are fine in the function of the partitive object of the 
atelic verb rakastaa. This shows that monta has not completely lost its ability to be a 
functional partitive, if the context unambiguously assigns such a function to it. Example 
(57) shows that the nominative form of the numeral kaksi 'two' is not possible in this 
context. 

What relates this lengthy discussion of monta with the partitive A is the fact that 
monta phrases quite frequently occur as transitive subjects, as in examples (58) and (59) 
below. 


(58) Minu-a  katsel-i  mon-ta  utelias-ta  silmä-ä.
1SG-PAR  watch-PST.3SG  many-PAR  curious-SG.PAR  eye-SG.PAR


‘I was watched by many curious eyes.’


(59) Mon-ta  sukupolve-a  rakens-i  kirkko-a
many-PAR  generation-SG.PAR  build-PST.3SG  church-PAR
(näkemättä sitä valmiina)
‘Many generations were (= participated in) building the church (without seeing it
finished).’


Branch (2001) reports that such uses of monta phrases in the function of A were already 
discussed by linguists at the end of the 19th century, which shows that its reanalysis as a 
nominative may have been going on for a relatively long time. Such a quantifier which is 
formally a partitive but functionally a nominative is probably another factor paving the 
road for quantified partitive phrases to spread into the function of A. Because monta is 
functionally a nominative, I do not consider examples such as (58) and (59) as instances 
of the partitive A proper. However, their existence must be taken into account as a factor 
supporting the partitive A. 

The constraint discussed in §3, stating that the clause with a partitive A cannot denote 
a collective accomplishment, seems to hold for monta subjects as well. Thus (60), with 
its accusative object, is understood in the distributive sense where monta ‘many’ has 
wide scope over the indefinite object ‘house’, i.e. that each person has built their own 
house, whereas (61), with the nominative numeral sata ‘hundred’, has both a collective 
and a distributive interpretation. 


(60) Mon-ta    ihmis-tä       on            rakenta-nut  talo-n.
     many-PAR  person-SG.PAR  have.PRS.3SG  build-PTCP   house-ACC
     ‘Many people have built a house [each their own].’
(61) Sata     ihmis-tä       on            rakenta-nut  talo-n.
     hundred  person-SG.PAR  have.PRS.3SG  build-PTCP   house-ACC
     ‘A hundred people have built a/the house [together or each their own].’ (C) 

The pleonastic partitive montaa, like partitive forms of (singular) numerals, cannot 
occur in the function of the partitive A. Because it is a singular partitive form, its use in 
existentials is restricted to contexts where it quantifies a mass noun, which must then be 
understood in a special sense (‘many kinds of a substance’); cf. (62). In contrast, the forms 
with monta, as well as nominative numerals, are quite typical in existential S argument 
NPs (63). 


(62) Tä-ssä    on          mon-ta-a      ~  viit-tä   kahvi-a.
     here-INE  be.PRS.3SG  many-PAR-PAR  ~  five-PAR  coffee-SG.PAR
     ‘Here is coffee of many ~ five kinds.’ (C) 


(63) Tä-ssä    on          mon-ta    ~  viisi     kahvi-a.
     here-INE  be.PRS.3SG  many-PAR  ~  five.NOM  coffee-SG.PAR
     ‘Here are many ~ five [portions of] coffee.’ (C) 


Summing up, in addition to the infinitival constructions discussed in §5, different 
quantifier phrases “pave the road” for the partitive-marked NP to spread into transitive 
clauses. A special case of this is the quantifier monta ‘many’, which is formally a singular 
partitive but has the function of a nominative numeral. However, other quantifying 
expressions in the plural likewise serve as an analogy to the transitive constructions with 
the partitive A. 


6.3 Quantifiers: interim summary 


Quantifying expressions turned out to be common in the occurrences of the partitive 
A I collected for this study, which suggests that they may play an important role in the 
spread of partitive NPs into the function of A. The study has demonstrated that both mass 
(‘a lot of’) and plurality (‘several’) types of quantifiers are in use. In more general terms, 
Finnish partitive NPs with quantifiers seem to have an intermediate status between 
nominative phrases indicating exhaustive quantification and (unquantified) partitive phrases 
indicating non-exhaustive quantification. This is clearest if we consider the use of such 
phrases as grammatical objects (cf. §6.1): partitive NPs with quantifiers behave like 
accusative (not partitive) objects with respect to the modification of duration by selecting 
durative modifiers of the type ‘in an hour’ (inessive-marked in Finnish). On the other 
hand, the nominative forms of many plurality quantifiers have acquired more specific 
quantificational meanings (e.g. ‘many of the’ or ‘several sets of’) which clearly restrict 
their use and make the partitive the unmarked option in many contexts. 

The partitive-marked quantifier that has developed furthest in this direction is monta 
(‘many’), which functionally behaves like a nominative of a cardinal numeral. However, 
other partitive-marked (plurality) quantifiers may be following this path by replacing 
the nominative in some contexts. When taking on these functions typical of nominative 
(or accusative, in object marking) NPs, the quantified partitive phrases themselves un- 
dergo a functional transition and become more similar to nominative/accusative than 
unquantified partitive NPs. 


7 Conclusions 


As has become evident in this study, it is difficult to obtain data of the partitive A, which 
seems to be a rare phenomenon in general, and occurs most typically in registers of 
unedited written language. Though considered an error by language planning authori- 
ties, the partitive A is used at least occasionally, and the examples I have collected, as 
well as those analyzed by Yli-Vakkuri (1979), do not sound ungrammatical to the native 
speaker's ear. It seems that the uses of the partitive A concentrate around atelic expres- 
sions of low transitivity. This semantic feature partially explains why the object NP is 
also in the partitive in most cases. Accusative objects seem to be in the minority, and if used, 
they are understood in a distributive sense where each referent of the partitive A (which 
is practically always in the plural) performs the activity individually. The partitive A 
seems to be clearly ungrammatical with the accusative object indicating a collective ac- 
complishment. 

I have also proposed that there are some grammatical subsystems and constructions 
that, figuratively speaking, pave the road for the partitive marking to spread into the 
subject of transitive clauses: 1) decay of verb agreement in clauses with an indefinite, 
clause-final plural subject (cf. also De Smit 2016); 2) constructions that combine an 
intransitive finite verb with a transitive infinitive “adverbial”, such as the progressive ‘be doing’ 
construction; and 3) the system of quantifying expressions where even partitive-marked 
quantifiers such as use-i-ta [several-PL-PAR] ‘several’ or sato-j-a [hundred-PL-PAR] ‘hundreds 
of’ indicate a definite quantity. This supports Yli-Vakkuri's (1979) argument that a 
typical partitive A is not quantitatively non-exhaustive in the way a bare partitive sub- 
ject is. Furthermore, the nominative forms of these quantifying expressions, which have 
been recommended by language planning authorities to be used instead of the partitive, 
have gained narrower definite meanings and thus might evoke implications the speaker 
does not wish to convey. If such semantic oppositions conventionalize, then the partitive 
form of such quantifiers may be developing into an unmarked indicator of an indefinite 
subject. 

In sum, the observations suggest that there is a pressure to mark indefinite plural sub- 
jects with the partitive not only in existential clauses (which are intransitive) but also 
in some transitive clauses, i.e. those that indicate an aspectually non-culminating, 
low-transitivity event. If existential clauses are considered a subtype of intransitive clauses,13 
then it can be generalized that among intransitive clauses the partitive marking con- 
cerns S arguments that are indefinite and indicate non-exhaustive quantification of a 
discourse-new referent (a substance or a multiplicity). Such an option has been missing 


13 In the Finnish tradition, existentials are usually treated apart from both intransitive and transitive clauses, 
which share many features such as the nominative subject, SV/AV word order, and subject-verb agreement. 
from the marking of the A argument in Standard Finnish, even though A arguments can 
likewise indicate discourse-new multiplicities (as the English indefinite plural in Several 
bystanders witnessed the accident). This may result in an analogical motivation for a 
similar system of case oppositions to arise in the marking of A arguments (cf. Seržant 2013: 
336–338). 

The Finnish partitive A fulfills the definition of differential argument (subject) marking 
presented by Witzlack-Makarevich & Seržant (2018 [this volume]). Their broad definition 
(cf. also Woolford 2008) states that DAM is “any kind of situation where an argument 
of a predicate bearing the same generalized semantic role (or macrorole) may be coded 
in different ways, depending on factors other than the argument role itself”. The narrow 
definition they provide states that DAM is “any kind of situation where an argument of a 
predicate bearing the same generalized semantic role (or macrorole) may be coded in 
different ways, depending on factors other than the argument role itself and/or the clausal 
properties of the predicate such as polarity, TAM, embeddedness, etc.” The Finnish 
partitive A (and obviously also partitive S) seems to fit both definitions. The partitive marking 
of the S argument, and (as the data discussed in the present paper show) sometimes even 
the A argument, typically concerns plural forms that are indefinite in two ways (as 
already argued by Siro 1957): 1) in the notional sense (= they have a discourse-new referent) 
and 2) in the quantitative sense (= they indicate a non-exhaustive quantity). However, 
since the presence of a quantifier, which is often partitive-marked itself, seems to be 
common in NPs with the function of the partitive A, feature 2 seems to concern only a 
minority of the instances. Considering the potential motivations for a DAM system listed by 
Witzlack-Makarevich & Seržant (2018), the Finnish partitive A includes features of both 
argument-triggered DAM (it concerns indefinite discourse-new plurals) and 
predicate-triggered DAM (it concerns certain low-transitivity verbs, especially verbs of perception 
as well as verbs that indicate a locative arrangement such as ‘follow’ or ‘surround’). 

Occasional uses of the partitive as a marker of the transitive subject have been pointed 
out in the literature for over a hundred years. In the absence of statistical data and a comparable 
set of unedited written language from an earlier era, it is difficult to say whether this 
indicates an ongoing change in the marking of the transitive subject. However, as De 
Smit's (2016) analysis demonstrates, the nominative has been in use in Old Finnish as 
the case of plural existential S arguments which would take the partitive in present-day 
Finnish. This suggests that the partitive has been expanding as a marker of the existential 
S in intransitive clauses during the last few centuries, and there may thus be a tendency 
to continue its expansion into transitive clauses to mark plural indefinite subjects as 
well. 
Abbreviations 

1    first person                  INE   inessive 
2    second person                 INF   infinitive 
3    third person                  NEG   negation, negative 
ACC  accusative                    NOM   nominative 
ADE  adessive                      PAR   partitive 
ALL  allative                      PL    plural 
DAT  dative                        POSS  possessive 
DOM  differential object marking   PRS   present 
ELA  elative                       PTCP  participle 
GEN  genitive                      PST   past 
ILL  illative                      SG    singular 
Some like it transitive: Remarks on verbs of liking and the like in the Saami 
languages 
Canonical DOM is rather uncommon in the Saami languages (Uralic), and the only clear 
instances of this are attested in South Saami where definiteness does determine the coding 
of objects in the plural. On the other hand, the coding of experiencer verbs (e.g., ‘like’, ‘care’ 
and ‘fear’) displays variation in this regard across Saami languages. With the North Saami 
verb liikot ‘like’, for example, the stimulus may appear in the illative, genitive-accusative 
and locative cases without any major difference in meaning. This has usually been viewed 
as unwelcome influence from the majority languages (Norwegian, Swedish and Finnish). 
In this paper, however, we will argue that it is no coincidence that the variation concerns 
mainly experiencer verbs, and more specifically, we will show that the attested variation 
can be seen as an uncanonical instance of DOM. First of all, the variation in the coding is 
not semantically determined in the sense that it does not affect the semantic roles of the 
relevant arguments, which is typical of canonical DOM as well. Second, differently from 
canonical instances of DOM, the variation concerns semantic cases instead of structural 
cases, and the variation is between two non-zero cases, while canonical DOM is between 
zero and non-zero case. Third, the conditioning factors are different from canonical DOM, 
since animacy and definiteness do not contribute to the discussed variation in any direct 
way. Language contact does play an important role in this process, but the pursuit of coher- 
ence, the semantic emptiness of the cases, and features of semantic transitivity also make a 
significant contribution to the variation. 


1 Introduction 


As is typical of all languages discussed in this volume, instances of canonical differential 
object marking are attested also in the Saami languages, as illustrated below: 


Seppo Kittilä & Jussi Ylikoski. Some like it transitive: Remarks on verbs of liking and 
the like in the Saami languages. In Ilja A. Seržant & Alena Witzlack-Makarevich (eds.), 
Diachrony of differential argument marking, 413–436. Berlin: Language Science Press. 
(1) South Saami (Uralic; Bergsland 1994: 60) 

    a. Laara  treavkah      dorjeme.
       L.     ski.PL(.NOM)  make.PST.PTCP
       ‘Laara has made (a pair of) skis.’

    b. Dejtie     treavkide   vööjnim.
       it.PL.ACC  ski.PL.ACC  see.PST.1SG
       ‘I saw the skis.’


As the above examples show, indefinite objects in South Saami bear (zero) nominative 
coding (1a), while definite objects appear in the accusative (1b). The examples in (1) thus 
constitute a typical instance of DOM, as the notion is typically understood, even though 
it should be noted that the variation illustrated in (1) is limited to the plural while canon- 
ical DOM concerns also objects in the singular. Moreover, differently from many other 
languages with DOM, animacy appears to play no role for DOM in South Saami. Instead, 
the variation in object coding in (1) is determined solely by definiteness.1 This is in line 
with the object coding in other Uralic languages (see, e.g., Virtanen 2015 for a discussion 
of Eastern Mansi). 

Even though the examples above can be viewed as a canonical instance of DOM, they 
do not constitute the most widespread type of variation in the object coding in the Saami 
languages. Quite the opposite, canonical DOM in the form illustrated in (1) seems to be 
limited to South Saami only. Much more common across the Saami languages is the kind 
of variation illustrated in (2) from North Saami: 


(2) North Saami (Uralic; personal knowledge) 
    Vástit          gažaldaga        ~  gažaldahkii!
    answer.IMP.2SG  question.GENACC  ~  question.ILL
    ‘Answer the question!’


(3) North Saami (Uralic; personal knowledge) 
    Itgo       liiko     gažaldaga        ~  gažaldahkii   ~  gažaldagas?
    NEG.2SG.Q  like.CNG  question.GENACC  ~  question.ILL  ~  question.LOC
    ‘Don’t you like the question?’

As shown above, the object may appear in the (genitive-)accusative, illative and also 
locative case without any major changes in semantics.2 In other words, both 
constructions in (2) mean ‘Answer the question!’, and the three alternatives in (3) pose the same 


1 As pointed out by Siegl (2012: 208), however, the nominative/accusative DOM in South Saami has not 
been studied thoroughly. Furthermore, the contemporary object marking seems to differ from that of the 
language system depicted by earlier grammarians. 

2 In the Saami grammatical tradition, only the (genitive-)accusative (and South Saami nominative plural) are 
regarded as object cases. Semantic cases such as the illative and the locative in analogous functions are 
usually characterized as adverbials. For the purposes of the present paper, all non-nominative arguments 
of the type seen in (2)-(3) are regarded as objects in the sense that they are not subjects and they are 
parts of the valence of the verbs in question, as it would be somewhat awkward to label freely alternating 
arguments as either (accusative) objects or (illative, locative or elative) adverbials on the basis of their 
external appearance only. 
question as to whether the hearer likes the question or not. It is also noteworthy that 
neither definiteness nor animacy contributes to the attested variation. As for (3), it should 
be noted that the kind of variation exemplified here is one of the favorite eyesores of 
Saami language purists, because usually only one of the three alternatives is deemed 
good North Saami. The variation illustrated above is not limited to North Saami: as the 
discussion in this paper will show, it is attested in other Saami languages as well. The 
variation seems to be most common for experiencer verbs (3), which will be the focus of 
our study. It is, however, important to note that similar phenomena can to some extent 
be observed for other verbs, such as vástidit ‘answer’ in (2). 

There are numerous studies dealing with DOM from different perspectives, as the 
chapters of this volume also very well show. Most of these studies are characterized 
by two important common features. First, the variation is between two structural cases, 
usually a zero-marked nominative (or absolutive) and an explicitly marked accusative 
(or accusative-dative) case. Second, the great majority of DOM studies restrict the 
notion to cases where variation in object coding is determined by animacy, definiteness 
or topicality. This paper also adds an entry to the already long 
list of DOM studies, but the type of DOM examined here is clearly different from that 
usually discussed. First, the typical DOM triggers, namely animacy and definiteness (or 
topicality), play no role in object coding. Second, the variation is often between seman- 
tic cases (e.g., locative and illative), although the accusative partakes in the variation as 
well. Third, we are dealing with a clear instance of lexically restricted predicate-triggered 
DOM, in Saami languages attested primarily (yet not exclusively) for experiencer verbs. 

Despite the evident differences from typical DOM studies, the variation examined in 
this paper also resembles canonical DOM in certain respects. The semantic roles of the 
differently coded objects do not vary in the cases discussed, which can be claimed to 
be true of canonical DOM as well; in the cases discussed in this paper, the role of the 
differently coded objects, regardless of their coding, is that of a stimulus. We hope that 
our study will broaden our perspectives on DOM and help to identify similar phenomena 
in other languages as well. The number of examples discussed in this paper and attested 
in Saami languages is not very high, but they nevertheless provide us with clear clues as 
to what kind of variation we are dealing with. 

As noted above, the instances of DOM discussed in this paper differ from the type 
typically discussed under DOM. The topic is also rather novel to Saami linguistics, where 
the type of variation illustrated in (2)-(3) is usually understood as unwanted interference 
from majority languages or language decay (cf., e.g., Vuolab-Lohi 2007: 425; Olthuis 2009: 
86-87). In this paper, the problem is approached from a more general perspective. The 
main features considered here are the effects of language contact, emptiness of semantic 
cases and tendency towards coherence in marking. In other words, we will show that the 
variation is not random and not necessarily a result of language decay following from 
language contact, as is often the view of language purists, but that it can also be given a 
valid language-internal explanation. Moreover, it is not a coincidence that the variation 
concerns experiencer verbs and not, for example, highly transitive verbs. It is, however, 
important to note that DOM is still a rather limited phenomenon in the Saami languages. 
It is attested mainly with experiencer verbs, and moreover, it applies only to a small set 
of these verbs. Despite this, we hope that our paper provides new insights into DOM. 

The discussion in this paper is based on the six most widely spoken Saami languages, 
which are described in the following section. We illustrate and discuss examples of many 
different experiencer verbs. However, it is not our goal to give a systematic overview 
of the variation between different verbs; instead, the variety of verbs serves only the 
purpose of illustrating the nature and limits of the variation under discussion. 

The organization of the paper is as follows. §2 discusses the examined Saami languages 
and their basic argument marking patterns as they are relevant to the discussion in this 
paper. §3 presents a set of concrete examples of the differential coding of experiencer 
verbs in Saami languages. In §4, the main theoretical implications of the paper are 
discussed.3 


2 The Saami languages and argument marking 


The Saami branch of the Uralic language family consists of a chain of closely related 
languages whose territory extends from the central parts of Norway and Sweden up to 
the Kola Peninsula of northwest Russia. Of the nine or ten living Saami languages, seven 
have official literary standards and six of them have several hundred or even thousands 
of speakers each. The discussion in this paper focuses on data from the following six lan- 
guages with the most speakers and widest literary use: South Saami (Norway, Sweden), 
Lule Saami (Norway, Sweden), North Saami (Norway, Sweden, Finland), Aanaar (Inari) 
Saami (Finland), Skolt Saami (Finland, Russia), and Kildin Saami (Russia). Our data is 
either drawn from or otherwise based on the literary use of the present-day languages. 
As the total number of speakers of the Saami languages is less than thirty thousand, all 
Saami languages are minority languages except in two Norwegian municipalities, where 
North Saami is the majority language of the local communities. As a consequence, vir- 
tually all present-day speakers of Saami languages are bi- or trilingual to some extent, 
and this naturally affects the minority languages in many ways — argument marking not 
being an exception. 

Not unlike in the other Uralic languages, the morphosyntax of the Saami languages 
is largely based on the interplay of morphological cases. All Saami languages are ex- 
plicitly nominative-accusative languages, with the zero-marked nominative for subject 
arguments and descendants of the Proto-Saami (and ultimately Proto-Uralic) accusative 
for direct objects. However, the picture is partly blurred by the fact that in the Saami 
languages east of Lule Saami (including North, Aanaar, Skolt and Kildin Saami), both 
the original accusative (*-m) and genitive (*-n) singular case suffixes have been lost, and 
the two cases have merged into one, the genitive-accusative case, which for most nouns 
differs from the nominative only by stem-internal differences. Furthermore, the same 


3 We wish to thank the editors, an anonymous reviewer and Nils Øivind Helander for a number of valuable 
comments on earlier versions of this paper. We also thank Tiina Sanila-Aikio and Eino Koponen for help 
and discussions on Skolt Saami and Elisabeth Scheller for corresponding help with Kildin Saami. 
language border – between Lule Saami in the west and North Saami in the east – 
witnesses the merger of two local cases in the east: the descendants of the Proto-Saami 
inessive (‘at’) and elative (‘from’) survive in the so-called locative case of the 
easternmost languages.4 On the other hand, the genitive-accusative merger is total – affecting 
both singular and plural forms – in North Saami only, but not in the easternmost 
languages (including Aanaar, Skolt and Kildin Saami), which have retained the distinction 
in the plural.5 


Table 1: The South, Lule and North Saami case systems exemplified with the 
words for ‘fish’ 

                      South Saami              Lule Saami           North Saami
                      Singular  Plural         Singular  Plural     Singular  Plural
Nominative            guelie    guelieh        guolle    guole      guolli    guolit     Nominative
Accusative            gueliem   guelide        guolev    guolijt    guoli     guliid     Genitive-accusative
Genitive (‘of’)       guelien   gueliej        guole     guolij
Illative (‘to’)       gualan    guelide        guollaj   guolijda   guollái   guliide    Illative
Inessive (‘at’)       guelesne  gueline        guolen    guolijn    guolis    guliin     Locative (‘at; from’)
Elative (‘from’)      gueleste  guelijste      guoles    guolijs
Comitative (‘with’)   gueline   gueliejgujmie  guolijn   guolij     guliin    guliiguin  Comitative
Essive (‘as’)         gueline                  guollen              guollin              Essive

Individual Saami languages also exhibit various degrees of syncretism within plural 
case forms and between the plural inessive/locative and singular comitative, for example. 
In addition, some of the languages make use of additional cases or regressing case-like 
adverbs labeled as abessives and partitives, but as their functions fall outside the imme- 
diate scope of the present paper, they will be omitted in the following description of 
argument marking in Saami. (For a more comprehensive description of the Saami case 
markers and their syncretism, see, e.g., Sammallahti 1998: 65-71; Hansson 2007.) For the 
purposes of the present paper, the common core of the Saami case morphology is pre- 
sented in Table 1, which exemplifies the case systems in South, Lule and North Saami. 

As regards phenomena that can be labeled as differential argument marking, as many 
as five of the six to eight cases in Table 1 are involved: nominative, (genitive-)accusative, 


4 The Proto-Saami inessive (*-sna) and elative (*-sta) are cognate with their namesakes in Finnic languages 
such as Finnish and Estonian (see, e.g., Sammallahti 1998: 66-71, 203). However, in the absence of the so- 
called external local cases characteristic of Finnic, the Saami cases are also used in most of the functions of 
the Finnic cases adessive and ablative. As a consequence, the single locative cases in languages like North, 
Aanaar and Skolt Saami (all spoken in Finland) as well as Kildin Saami roughly correspond to as many as 
four local cases in Finnish. In the same vein, the Saami illative (*-sen) is cognate with the Finnic illative 
(*-sen), but is also a functional equivalent of the Finnic allative. 

5 When speaking of the “easternmost Saami languages”, we are not taking a stance on whether or not the 
Saami branch must be strictly divided into two – Western Saami and Eastern Saami with capital letters – 
along the phonologically significant, but lexically less decisive border between North Saami and Aanaar 
Saami. For a comprehensive discussion of these issues, see Rydving (2013). 
illative, locative (elative) and comitative.6 Of special interest here is the use of the local 
cases illative and locative as argument markers whose functions can hardly be distin- 
guished from the direct objects marked with the (genitive-)accusative (and the South 
Saami plural nominative). Although this paper pays special attention to the internal 
and mutual variation in Saami argument marking, it is worth emphasizing that in spite 
of considerable phonological, morphological and lexical variation that makes even the 
closest Saami languages mutually unintelligible, their morphosyntactic structures are es- 
sentially quite similar. In a nutshell, the variation between individual Saami languages 
is comparable to the variation within the Germanic languages, for example. 

The basics of Saami argument marking can be seen in the following examples from 
South Saami: 


(4) South Saami (Uralic; SIKOR) 
    Gosse  aktem    gåmmam     gaavnedigan   akte  dejstie    guaktijste
    when   one.ACC  woman.ACC  find.PST.3DU  one   it.PL.ELA  couple.PL.ELA
    laejpieh       öösti        jih  dejtie     varki    byöpmedi.
    bread.PL(NOM)  buy.PST.3SG  and  it.PL.ACC  quickly  eat.PST.3SG
    ‘When they (two) found a woman, one of them bought some loaves of bread and 
    ate them quickly.’

(5) South Saami (Uralic; SIKOR) 
    Daelie  die   riektes  aaksjoem  beekti         jih  ålmese   vedti.
    now     then  real     axe.ACC   bring.PST.3SG  and  man.ILL  give.PST.3SG
    ‘Then he brought a real axe and gave it to the man.’

(6) South Saami (Uralic; SIKOR) 
    Månnoeh  dutnjien  jeehkimen,  Lejsa.
    1DU      2SG.ILL   trust.1DU   Leejsa
    ‘We (two) trust you, Leejsa.’

(7) South Saami (Uralic; SIKOR) 
    Edtjem     manne  datneste  billedh   juktie   im       datnem   lyjhkh?
    shall.1SG  1SG    2SG.ELA   fear.INF  because  NEG.1SG  2SG.ACC  like.CNG
    ‘Am I supposed to fear you if I don't like you?’


Explicit subject NPs (present in (6) and (7)) are frequently omitted, as the subject 
participant can often be inferred from the context and the form of the finite verb. As for 
patients, themes or stimuli of various actions and events such as finding (4), buying (4), 
eating (4), bringing (5), and liking (7), the object is most often in the accusative. However, 


6 In addition to the genitival functions of the genitive(-accusative) and the spatial (‘at’) semantics of the 
inessive/locative, the functions of the essive case are not directly relevant for the present discussion, 
although the interplay between nominative- and essive-marked arguments and secondary predicates could 
also be regarded as differential argument marking in the broad sense (cf. Siegl 2017; Ylikoski 2017, and 
Witzlack-Makarevich & Seržant 2018 [this volume]). 
as was already briefly mentioned in the introduction, for plural objects such as loaves 
of bread, a somewhat classical example of differential object marking is available: 
accusative plural is used to refer to definite objects, whereas less definite objects may be 
expressed by the nominative plural. Thus, in (4) ‘buying loaves of bread’ and ‘eating them 
(= the loaves)’ are expressed by the nominative and accusative, respectively. In the singular, 
such objects are always marked by the accusative only. Although this underdescribed 
phenomenon seen in (1) and (4) would merit a separate study (see, e.g., Wickman 1955: 
30-36; Magga & Mattsson Magga 2012: 184-186), it is enough to state here that South 
Saami seems to be the only Saami language exhibiting differential nominative/accusative 
object marking (see also example (1) above). 

For the purposes of the present paper, however, it is important to note that some verbs 
take their arguments in other cases, too, such as the local cases illative (‘to’) and elative 
(‘from’). To begin with, the illatives of all Saami languages could actually also be labeled 
as datives, as in addition to their spatial meaning (‘to’), the illatives are the default case 
for marking recipients such as the man to whom an axe is given in (5). Moreover, the 
illative-marked noun phrase of (6) as an argument of the verb jaehkedh ‘believe, trust’ 
is also reminiscent of so-called dative objects in cases such as ich glaube/vertraue dir ‘I 
believe/trust you' in German, as the illatives of Saami languages share many functions 
with the dative in German. Example (7) presents the second person singular pronoun 
datne in two case forms: datnem as the accusative object of the verb lyjhkedh ‘like’, but 
datneste as the elative (‘from’) complement of the verb billedh ‘fear’. While the choice 
of cases like the ones seen here may have historical and metaphorical connections to 
the concrete spatial meanings of local cases, from a strictly synchronic perspective we 
are often dealing with verbs whose argument structures seem to require the use of the 
elative instead of the default accusative case used for most verbs (billedh ‘fear’) or vice 
versa (lyjhkedh ‘like’).

Many Saami languages show considerable variation as to which cases are used for 
marking arguments of verbs such as the experiencer verbs for ‘trust’, ‘fear’ and ‘like’ 
as seen in (6)-(7). While the nominative-accusative differential object marking seen in 
(4) has a clear semantic function, it is more difficult to recognize possible semantic dif- 
ferences behind what seems to be more arbitrary variation in Saami argument marking. 
As it turns out, however, an important source of the variation to be described in the 
following section seems to be the sociopolitical environment of the Saami languages: in 
spite of relatively uniform morphosyntax, the present-day Saami languages are mostly 
used by bilinguals whose other languages include nation-state languages as divergent 
as Norwegian, Swedish, Finnish and Russian. While the Scandinavian (Norwegian and 
Swedish) influence on Saami syntax is rather uniform, Russian is quite different, and 
Finnish belongs to the altogether different stock of Uralic languages. 

3 Data: experiencer verbs and their coding in Saami (with
a special focus on ‘like’)


In this section, the linguistic coding of experiencer verbs across languages and in Saami 
languages will be discussed. After briefly commenting on the coding of experiencer verbs 
from a cross-linguistic perspective, we present some of the semantic features that explain
their less transitive coding. The remainder of the section is devoted to the examination of
Saami data, especially focusing on the argument marking of verbs that denote positive emotions
such as liking, loving and caring (see also §4.1 further below).

It is received wisdom in linguistics that the coding of experiencer verbs often devi- 
ates from the basic transitive pattern of a given language; for example, dative coding 
of the subject is common with experiencer verbs (see, e.g., Verma & Mohanan 1990 and
Aikhenvald et al. 2001). These formal differences from basic transitive constructions of 
a given language are not random, instead following from the different semantic role as- 
signment of experiencing; experiencer verbs do not involve an agent and a patient, but 
an experiencer and a stimulus instead. Neither the experiencer nor the stimulus is nec- 
essarily affected, while in typical transitive events, the patient must be affected in order 
to constitute a true patient. It is, however, important to note that experiencer verbs do 
not constitute a semantically coherent verb class, but there are clear differences in their 
nature, which is also reflected in their coding. First, for example, in Finnish, the partitive 
(with verbs like ‘love’ and ‘hate’), elative (‘like’), illative (‘get bored with’), allative (‘get 
mad at’) and also accusative (‘see’, ‘hear’) can appear with experiencer verbs. Second, 
different classes of experiencer verbs differ according to whether the stimulus or the 
experiencer surfaces as the subject. 

With the Finnish verbs noted above, the subject refers to the experiencer, while the 
differently coded second argument codes the stimulus. However, there are other verbs, 
such as miellyttää ‘please’, or pelottaa ‘scare’, where the stimulus surfaces as the subject, 
and the partitively coded object refers to the experiencer. In the same vein, in Saami 
languages such as North Saami, verbs like balddihit ‘scare’ code the stimulus as a nomi- 
native subject and the experiencer as a genitive-accusative object. Finally, there are also 
verbs such as Finnish iloita ‘rejoice’, where the (elatively coded) stimulus can be seen as a 
kind of optional oblique that can be left out if the reason for rejoicing is not contextually 
relevant. Again, the same can be said about Saami verbs like North Saami illudit ‘rejoice’ 
(cognate of Finnish iloita), which will be discussed further below. Consequently, it is 
rather hard to make any cross-linguistic generalizations about the coding of experiencer 
verbs apart from the fact they typically somehow deviate from basic transitive construc- 
tions. In this paper, the focus is exclusively on experiencer verbs that code the stimulus 
as the (direct) object. This is well in line with the goal of the paper, which is to
show that there is a kind of differential marking for the objects of experiencer verbs. 
Taking other types of experiencer verbs into account may distort the results, because 
the attested variation follows from features that are not relevant to the discussion in 
this paper. 

Argument marking of experiencer verbs has received almost no attention in Saami linguistics
per se. Except for North Saami, the major Saami language that is spoken by about
90% of all speakers of the Saami languages, grammatical descriptions of most Saami lan- 
guages contain only little information about argument marking. The general pattern 
of the existing school grammars (e.g., Spiik 1989; Olthuis 2000; Moshnikoff et al. 2009; 
Magga & Mattsson Magga 2012) is to state that the object is marked by the accusative 
case, whereas most other cases function as adverbials. The latter functions are described 
quite sporadically and impressionistically, though. For example, descriptions of South 
Saami characterize the use of elative in clauses like (7) as adverbials of cause, whereas 
some other complement-like elatives have been labeled as partial objects (Bergsland 1994: 
60-61, 72; Magga & Mattsson Magga 2012: 186). On the other hand, the identical behav- 
ior of the Lule Saami elative with verbs like ballat ‘fear’ (cognate of South Saami billedh 
seen in (7)) is explained as part of a larger whole, wherein verbs of fearing are said to co- 
occur with the object of fear marked by the elative (Spiik 1989: 98). Further still, Nickel 
and Nickel & Sammallahti (2011: 233, 236, 529–530) describe the analogous use of the
North Saami ballat ‘fear’ as an example of verbs that come close to being transitive but
govern the locative case instead. However, none of the grammars or other descriptions 
of Saami syntax have paid significant attention to possible semantic reasons for not us- 
ing the accusative for all object-like arguments. Additionally, little attention has been 
paid to the fact that in actual use, many verbs show variation in how the non-subject 
arguments are coded. The most remarkable exception in this respect is Helander's (2001: 
134-143) study of the North Saami illative in which he briefly examines the argument 
structure of the emotion verbs áibbasit ‘miss, yearn’, dorvvastit ‘count on, rely on’, duhtat
‘settle for’, jáhkkit ‘believe’, liikot ‘like’, luohttit ‘trust’, oskut ‘believe, have faith’ and
suhttat ‘get angry’, some of which also show variation between illative arguments and
other cases as well as postpositions in their coding of the stimuli. The list could be con- 
tinued with verbs like dolkat ‘get fed up’, which takes either the illative or the locative, 
or illudit ‘rejoice; celebrate’ and heahpanit ‘be ashamed of’ with even more variation to
be discussed further below. However, no comparative studies that focus on these issues
and cover more than one Saami language exist.

In the following, such variation in argument marking will be described and discussed 
by examining the use of experiencer verbs denoting liking in six Saami languages. This 
particular group of verbs shows both language-internal and cross-Saami variation, which 
makes it suitable for providing novel contributions to our understanding of less typical 
instances of DOM. Due to the deficiencies and often prescriptive attitudes of existing
grammatical descriptions, most of the data is drawn from authentic (in part translated)
texts made available by the SIKOR corpus at UiT The Arctic University of Norway. Al- 
though much of our understanding of South, Lule, North and Aanaar Saami is backed up 
by comparatively large corpora, this study is predominantly qualitative.⁷ As for our
understanding of the severely endangered Skolt Saami and Kildin Saami, our observations
are more dependent on second-hand sources and elicited information from native and
second-language speakers.


⁷ With respect to the size of the language communities, the available corpora are quite large. With 21.1 million
words for North Saami, 0.8M for Lule Saami and 0.7M for South Saami, they contain approximately one
thousand words per one speaker of the languages. As for the 1.3M words for Aanaar Saami, with about 400
speakers, the ratio is even higher.

The first verb to be examined is the South Saami lyjhkedh ‘like’, an apparently recent
loan from Scandinavian languages where especially the Norwegian like (and to a lesser
extent Swedish lika) has approximately the same meaning and exhibits similar syntactic
behavior. It was already seen in example (7) above that lyjhkedh is a transitive verb that
takes an accusative object instead of the elative or any other local case. Yet
fully in line with the general object marking pattern discussed in §2, the plural object
is marked with either accusative or nominative, depending on whether its referent is
definite (8) or indefinite (9), respectively:


(8) South Saami (Uralic; SIKOR)
Im      lyjhkh   niejtide    mah    desnie.
NEG.1SG like.CNG girl.PL.ACC REL.PL here
‘I don't like the girls here.’


(9) South Saami (Uralic; SIKOR)
Dihte lyjhkoe  åenehks mirhke almah,     guktie månnoeh, Ajloe foorhkedi.
3SG   like.3SG short   dark   man.PL.NOM like   1DU      Ajloe laugh.PST.3SG
‘She likes short dark men, like the two of us, Ajloe laughed.’


In a word, lyjhkedh behaves just like any normal transitive verb of South Saami. By 
contrast, in Lule Saami the analogous loan verb lijkkut ‘like’ usually governs the illative
case instead. As a matter of fact, grammars and dictionaries present the illative as the 
only option (Spiik 1989: 97; Kintel 2012 s.v.), but accusative objects also exist. Both alter- 
natives are present simultaneously in (10) where the illative NPs guolláj and accusative 
dáv gáváv could apparently be exchanged with the accusative guolev and illative dán 
gávváj without a change in meaning:⁸


(10) Lule Saami (Uralic; NuorajTV)
Lijkku   guolláj? De   ham de   lijkku   dáv      gáváv       aj?
like.2SG fish.ILL then DPT then like.2SG this.ACC picture.ACC also
‘You like fish? Then you must like this picture too, right?’


However, unlike the nominative/accusative alternation in South Saami, the choice of 
illative or accusative does not seem to be motivated by either semantic or syntactic fac- 
tors. Instead, the most plausible explanation for the variation seems to align with the 
received view on similar variation in North Saami: 


⁸ In the Lule Saami corpus of approximately 800,000 words (SIKOR), nearly half of the 160 instances of the
verb lijkkut take an infinitive complement. Of the 88 instances with an NP complement, 80 are in the illative 
and 8 in the accusative, with no visible differences in meaning or distribution. Both cases are used to refer 
to singular and plural, animate and inanimate, definite and indefinite referents, for example. 
(11) North Saami (Uralic; personal knowledge)

a. Liikot   guollái? De   han de   liikot   dán         govvii      maid?
   like.2SG fish.ILL then DPT then like.2SG this.GENACC picture.ILL also

b. Liikot   guoli?      De   han de   liikot   dán         gova           maid?
   like.2SG fish.GENACC then DPT then like.2SG this.GENACC picture.GENACC also

c. Liikot   guolis?  De   han de   liikot   dán         govas       maid?
   like.2SG fish.LOC then DPT then like.2SG this.GENACC picture.LOC also

‘You like fish? Then you must like this picture too, right?’


For the North Saami liikot ‘like’, as many as three different cases are available.⁹ North
Saami is the Saami language with not only the most speakers, but also the most grammat- 
ical research and language planning. As a consequence, the variation seen in (11a)-(11c) 
has attracted the attention of both descriptive and prescriptive grammarians. To put it 
briefly, the use of the illative (11a) is unanimously regarded as the most original North 
Saami, whereas the use of the genitive-accusative and locative are considered interfer- 
ence from Scandinavian (11b) and Finnish (11c), respectively: 


(12) Norwegian (Germanic; personal knowledge)
Liker    du  fisk?
like.PRS 2SG fish


(13) Finnish (Uralic; personal knowledge)
Pidätkö    kalasta?
like.2SG.Q fish.ELA


‘Do you like fish?’


The data in (12) and (13) corresponds to the variation in the Saami languages rather 
directly. However, although continuously rejected by language purists (e.g., Magga 1987: 
127; Čállinrávagirji 2003: 87; Vuolab-Lohi 2007: 425), both the genitive-accusative and the
locative have accompanied the verb liikot for decades if not centuries. It seems that the
authenticity of the use of the illative has been taken for granted due to the fact that the
illative was the most common alternative, and nearly the only alternative in earlier periods.
The most detailed discussion on this issue is presented by Helander (2001: 139) whose
earliest examples of the “wrong” cases stem from the beginning of the 20th century, and
some instances of the genitive-accusative can actually be found already in the folklore
recorded and authentic texts composed in the 19th century (see, e.g., Qvigstad 1927: 134,
190; Ylikoski 2016). From the non-prescriptivist point of view adopted by Helander, it is
easy to agree that all of the sentences (11a)-(11c) are grammatical North Saami. The difference
is that only (11a) seems to be shared by the entire speech community, whereas (11b)
is mainly used by Saami-Scandinavian bilinguals and (11c) by Saami-Finnish bilinguals.¹⁰


⁹ In accordance with the general patterns of NP morphosyntax (see, e.g., Sammallahti 1998: 100–101), the
determiner dán in (11a)-(11c) remains in the genitive-accusative even when headed by a noun in the illative
or locative.

The above-mentioned verbs lyjhkedh (South Saami), lijkkut (Lule Saami) and liikot 
(North Saami) have not been compared with each other earlier, but when this is done, the 
comparison can be extended up to Aanaar Saami where the etymological and semantic 
equivalent of these verbs is lijkkud: 


(14) Aanaar Saami (Uralic; SIKOR)
Amahán  te  mij puoh vissásávt lijkkup   kuálán   ja  ráhistep
I.guess DPT 1PL all  surely    like.1PL  fish.ILL and love.1PL
kyele,   ko tom    jyehi peeivi  šiev puurrámlustoin puurráp (...)
fish.ACC as it.ACC every day.GEN good appetite.COM   eat.1PL
‘I guess we all really like fish and love fish, as we eat it every day with great
pleasure (...)’


(15) Aanaar Saami (Uralic; SIKOR)
Kreikkaliih id      lijkkum       ennuv=gin syemmilijn.
Greek.PL    NEG.3PL like.PST.PTCP much=DPT  Finn.PL.LOC
‘The Greeks did not like Finns that much.’


To begin with, (14) contains two accusative objects: one for the experiencer verb ráhistid
‘love’ and one for a more concrete transitive verb puurrád ‘eat’, and in this respect
Aanaar Saami does not differ from the languages discussed thus far. However, lijkkud
apparently never takes accusative objects, but it does not remain without variation either:
the verb governs the illative kuálán ‘fish’ in (14), but the locative syemmilijn ‘Finns’
in (15). Again, the two variants are in free variation, as it would be equally possible to
replace the illative kuálán with the locative kyeleest, or, vice versa, the locative
syemmilijn with the illative syemmiláid. Furthermore, quite like with North Saami (11a)-(11c),
the Aanaar Saami language planners have until recently regarded the use of locative as
unwelcome Finnish interference, but according to a recent decision of an Aanaar Saami
language planning organ, both alternatives are now acceptable (Olthuis 2009: 86-87).


¹⁰ It is noteworthy that the variation exemplified in (11a)-(11c) has never been regarded as anything but
full synonymy (Magga 1987: 127; Helander 2001: 139, 141; Čállinrávagirji 2003: 87; Sammallahti 2005: 205;
Vuolab-Lohi 2007: 425). As seen in example triplets such as (i) and (ii), the illative, genitive-accusative and
locative are used with both animate and inanimate, and both definite and indefinite referents, for example.

(i) North Saami (Uralic; Čállinrávagirji 2003: 87)
Mun liikon   dutnje  ~ du         ~ dus.
1SG like.1SG 2SG.ILL ~ 2SG.GENACC ~ 2SG.LOC
‘I like you.’

(ii) North Saami (Uralic; Sammallahti 2005: 205)
Mun in      liiko    guollái  ~ guoli       ~ guolis.
1SG NEG.1SG like.CNG fish.ILL ~ fish.GENACC ~ fish.LOC
‘I don't like fish.’

The easternmost Saami languages such as Skolt Saami and Kildin Saami do not share 
the Scandinavian loan verb discussed above, nor do we have large corpora for these 
languages. However, the existing dictionaries and texts support the information provided 
by our colleagues with intimate knowledge of these languages. In Skolt Saami, the verb 
tu’kkeed ‘like’ behaves like North Saami liikot and Aanaar Saami lijkkud in governing 
the locative case as seen in (16) and (17a); neither the accusative, illative nor other cases 
actually occur in the present-day language, although data from traditional dialects also 
include examples of accusative objects, as in (17b), which is deemed ungrammatical in 
today's language: 


(16) Skolt Saami (Uralic; Koponen et al. 2010: 97)
Mon jidm    tüst   tuKKüüm       ni  voops, dóoóst.
1SG NEG.1SG it.LOC like.PST.PTCP not at.all it.LOC
‘I didn't like that [work] at all.’


(17) Skolt Saami (Uralic; personal knowledge, confirmed by Tiina Sanila-Aikio; (17b)
from Itkonen 1958: 612, not accepted by present-day speakers)¹¹
a. Tést    jie     tu'KKed.
   it.LOC  NEG.3PL like.CNG
b. (*)Tó"n jie     tu'KKed.
   it.ACC  NEG.3PL like.CNG
‘They don't like it.’


Our last example comes from Kildin Saami, a language that in a way lacks a verb for
‘like’. Instead, sentences denoting liking are centered around the verb miillte ‘please’,
and the word referring to the stimulus of liking (tedt lann ‘this country’ in example (18))
functions as the grammatical subject of pleasing, whereas the experiencer is marked with
the illative. Alternatively, it would be possible to resort to the transitive verb šoabše ‘love’,
which takes the accusative just like the corresponding verbs in apparently all Saami 
languages (compare example (14) from Aanaar Saami). 


(18) Kildin Saami (Uralic; Lindgren 2013: 240) 


A, MyHH  naobeda madm annb menm moun3. 
ja munn  naad'eda tedt lann meellt torine. 
and 1sG believe.isc this country please.3sG  2SG.ILL 


“and I believe that you will like this country? 


"This claim is based on the data from and judgments by speakers of Skolt Saami in Finland, but the language 
also has some elderly speakers in Russia. 

What is most interesting in Kildin Saami is that the argument structure of miillte
‘please’ (18) is fully the opposite of the most common pattern of the Lule, North and
Aanaar Saami verbs lijkkut (10), liikot (11a) and lijkkud (14) with which the illative case 
is used to code the stimulus, not the experiencer of pleasure (liking). On the other hand, 
as the illative is also the case of recipients and thus in a way the "dative" case of all Saami 
languages (see, e.g. (5)), the Kildin Saami sentence (18) is conceptually and structurally 
an instance of a well-known type of dative experiencer sentences. 

To summarize, the variation in the coding of liking verbs in the six Saami languages 
described above can be condensed in Table 2.¹² For the purposes of the present discussion,
the focus is on the types of DOM related to the verbs of liking in particular, and the 
more canonical instances of DOM as seen in the plural object marking of South Saami 
in general (examples (1a)-(1b) and (4)) are not repeated here. 


Table 2: Argument marking of ‘liking’ in six Saami languages.


South Saami (Norway, Sweden)            Bienje    lyjhkoe  gueliem.
Lule Saami (Norway, Sweden)             Bena      lijkku   guolev ~ guolláj.
North Saami (Norway, Sweden, Finland)   Beana     liiko    guoli ~ guollái ~ guolis.
Aanaar Saami (Finland)                  Peená     lijkkoo  kuálán ~ kyeleest.
Skolt Saami (Finland, Russia)           Piánnai   tu'kkad  (*)kuel ~ kuelest.
Kildin Saami (Russia)                   IIeuu2    woabawm  KYND.
                                        Peenne    šoabašt  kuul.
                                        dog(.NOM) like.3SG fish.ACC / fish.ILL / fish.LOC
                                        ‘The dog likes (the) fish.’


¹² Personal knowledge; Skolt Saami and Kildin Saami examples provided and confirmed by Tiina Sanila-Aikio
and Elisabeth Scheller, respectively. For the purpose of visualization, the South Saami example is presented
in a slightly marked word order (SVO) instead of the most unmarked SOV order typical of the language (cf.
(1), (4), (5), (7)). In cases of variation, the boldface indicates the variants officially acknowledged by language
authorities. Kildin Saami šoabašt in Table 2 means primarily ‘loves’; for the use of the verb miillte ‘please’,
see (18) above and (i) below:


(i) Kildin Saami (Uralic; personal knowledge; confirmed by Elisabeth Scheller)
Ileuue3  Menunm     K)yJULb.
Peennge  meellt     kuull’.
dog.ILL  please.3SG fish
‘The dog likes fish.’ (Lit. ‘Fish pleases the dog.’)

Table 2 also lists the states in which the examined languages, presented in geographical
order from southwest to northeast of the Saami territory, are spoken as minority
languages. In this connection, a number of facts are worth noting: As for the variation
seen in Lule, North and Aanaar Saami, the use of the illative case is considered the most 
original. Even though public prescriptivist statements about the unwanted influence of 
majority languages have been presented for Aanaar and North Saami verbs only (e.g. 
Morottaja 2007: 33; Vuolab-Lohi 2007: 425), it is also quite likely that the use of the ac- 
cusative in Lule Saami and that of the locative in Skolt Saami are influenced by their 
respective majority languages. When speaking of verbs of liking, two types of foreign 
influences are available. As seen in (12), the Norwegian verb like follows a nominative-accusative
pattern, but so does its closest Swedish equivalent gilla ‘like’, as well as the
verb ljubit' ‘love, like’ in Russian, which has long had a considerable influence on Kildin
Saami. On the other hand, the use of the Finnish elative - the cognate of the Saami
elative/locative - in (13) easily explains the established use of the locative for the liking
verbs of all three Saami languages spoken in Finland. To make the role of language contact
even more explicit, it can be pointed out that the use of the Kildin Saami miillte
‘please’ in (18) is analogous to that of the Russian nravit'sja ‘please’ (19). However, this
verb type falls outside the main scope of the present paper.


(19) Russian (Slavic; personal knowledge)
Я   надеюсь,  что  тебе    нравится   эта    страна.
ja  nadejus', čto  tebe    nravitsja  eta    strana.
1SG hope.1SG  COMP 2SG.DAT please.3SG this.F country


‘I hope that you will like this country.’


The influence of language contact will be discussed in more detail and with additional 
examples in §4.2 below. However, it must be noted that the Saami languages also exhibit
DOM that cannot be easily explained away by referring only to interference from majority
languages. As pointed out by Helander (2001: 140-141), the North Saami suhttat
‘get angry’ may take not only the illative and locative cases, but also a postpositional
phrase headed by ala ‘on(to)’, and only the latter alternative can be explained by the
influence of the Scandinavian preposition på ‘on(to)’. Some verbs such as the North Saami
illudit ‘rejoice; celebrate’ take not only the illative, locative and genitive-accusative, but
also the comitative case. Furthermore, the more than two thousand occurrences of illudit
‘rejoice; celebrate’ in the available North Saami corpus (SIKOR) also include many sentences
in which the stimulus of rejoicing is not marked by any of these four cases, but
by the postpositions alde ‘on’, badjel ‘over’, badjelii ‘onto’ and dihte ‘because of’. What
is more, occurrences of the verb heahpanit ‘be ashamed of’ are accompanied, in addition
to the four above-mentioned cases, by yet another set of postpositions (alde ‘on’, badjel
‘over’, dihte ‘because of’, beales ‘for, on behalf’, geazil ‘for, on account of’ and ovddas ‘for,
in front of’) (see also Ylikoski 2016).

To our knowledge, however, language contacts are not the whole story: there are other 
factors at play here as well. It might also be possible to analyze the rich variation in some 
verbs such as the North Saami illudit ‘rejoice; celebrate’ and heahpanit ‘be ashamed 
of’ as combinations of intransitive predicates and optional obliques denoting the cause
or stimulus of the experience. However, multiple patterns of coding the stimulus are 
generally verb-specific and therefore seem to belong primarily to the realm of argument 
marking. Needless to say, details and possible preconditions of such phenomena in the 
syntactic patterns of individual verbs in North Saami and other Saami languages call for 
further research. The present discussion of a small sample of Saami experiencer verbs is 
the first attempt to outline some possibilities and perspectives on such endeavors. 


4 Discussion 


4.1 Preliminaries 


In the previous section, we have presented some of the variation in the coding of objects 
with experiencer verbs in the Saami languages. The variation is best seen as manifesta- 
tions of DOM, because the marking is not semantically determined in the sense that the 
semantic roles borne by the affected arguments are maintained (the affected argument re- 
tains its role as a stimulus) and the alternation in the marking is not directly determined, 
but only made possible by the verb (i.e., we are not dealing with variation determined 
by the inherent semantics of verbs, as we are in the case of experiencer vs. prototypical 
transitive verbs). The instances discussed here represent restricted predicate-triggered 
DOM, because the described variation is attested mainly for experiencer verbs. More- 
over, the discussed instances of DOM can be claimed to be connected only loosely with 
definiteness, as there are only a few signs that suggest that the variation may be affected 
by habitual vs. concrete reading of the constructions in question. The rationale behind 
the variation differs from that of typical canonical DOM in that the typical triggers of 
DOM, animacy or definiteness, seem to play no role in the cases discussed in this pa- 
per (the possible contribution of definiteness is best seen as a by-product). Finally, the 
variation is not between two structural cases, but rather concerns semantic cases (and 
in some instances also postpositions, as mentioned above). In this section, we will dis- 
cuss the most important contribution of the Saami languages to our understanding of 
DOM in more detail. Basically, three partly competing factors can be seen that add to 
our understanding of DOM: language contact, the semantic emptiness of the cases (or 
other case-like categories) involved in the variation, and the pursuit of coherence. 


4.2 Language contact 


As noted above, the Saami languages are all minority languages spoken in the northern 
parts of Finland, Sweden and Norway, as well as the northwesternmost part of Russia. 
This has the very natural consequence that language contact has influenced and contin- 
ues to influence the structure of Saami languages in many ways, and argument marking 
is no exception in this regard. The major results of this contact were illustrated in Table 2 
above. Table 2 and the following discussion clearly show how the majority languages 
have affected the coding of liking verbs in Saami, given that the most original pattern 
in Lule, North and Aanaar Saami has been the one in which the stimulus of liking is
coded with the illative case, whereas the accusative and locative marking are both new 
and analogous to the patterns of the majority languages at the same time. It is also im- 
portant to note that we are not dealing with a transfer of DOM in a language contact 
situation, but rather contact with different languages has produced DOM for a group of 
predicates in the minority languages. 

To give another example of DOM among the experiencer verbs in Saami languages, 
Table 3 presents a likewise condensed collection of the major patterns of expressing 
‘caring’ and its participants in five Saami languages. The South Saami verb pryjjedh is
a relatively recent loan from Norwegian and Swedish (bry seg/sig), whereas the Lule 
Saami berustit, North Saami berostit, Aanaar Saami perustid and Skolt Saami peersted all 
go back to Finnish (perustaa). 


Table 3: Argument marking of ‘caring’ in five Saami languages


South Saami    Bienje   ij       pryjh     gueleste ~ gueliem (~ guelien bijre).
Lule Saami     Bena     ij       berusta   guoles ~ guolev (~ guole birra).
North Saami    Beana    ii       beroš     guolis ~ guoli (~ guoli birra).
Aanaar Saami   Peená    ii       peerust   kyeleest.
Skolt Saami    Piánnai  ij       peerst    kuelest.
               dog      NEG.3SG  care.CNG  fish.ELA/LOC  fish.(GEN)ACC  fish.GEN(ACC) about


‘The dog doesn't care about fish.’


In the Scandinavian languages, the stimulus of ‘caring’ is coded with the preposition 
om ‘about’, whereas the Finnish verb governs the elative. It is understandable that Saami 
languages most commonly use the elative/locative case for caring verbs, too, because 
this is probably inherited from the Finnish loan original. On the other hand, it is also 
understandable that the westernmost Saami languages (under Scandinavian influence) 
occasionally resort to the postposition bijre/birra ‘about’, which largely corresponds to 
the most abstract functions of the Scandinavian om. However, at the same time, the same 
languages - South, Lule and North Saami - also witness accusative coding that seems 
likewise absent in Aanaar and Skolt Saami. 

It is probably no coincidence that experiencer verbs are the foremost playground of 
DOM in Saami languages. As noted above, the coding of experiencer verbs often devi- 
ates from the basic transitive pattern of a given language in addition to which there is 
variation in their coding within languages (see examples (11a)-(11c) from Finnish). What 
makes the coding of experiencer verbs in Saami languages interesting is the fact that 
contact with structurally different source languages (governing different cases and ad- 
positions) has created a situation where the coding patterns of the source languages 
mirror the cross-linguistic variation attested within verbs in other languages (e.g. in 
German ‘be cold’ governs a dative subject, while ‘see’ appears in a transitive construc- 
tion). In contrast to typical cross-linguistic variation in experiencer verbs, yet due to 
contact with structurally different source languages, similar variation is reflected within 
one language and even more so in the group of closely related Saami languages. What
is also noteworthy here is that the variation seems most evident and productive for 
experiencer verbs; other verbs allow it only to a limited degree, if at all. For example, 
the coding of basic transitive clauses is consistent in the contact languages, because all 
of them are nominative-accusative languages, even though Norwegian and Swedish do 
not code A and O using cases like Finnish and Russian do. Consequently, there is no
contact-induced variation in the coding of A and O in prototypical transitive clauses, and 
language contact aids in explaining why obliquely coded arguments have been affected. 
However, borrowing does not follow automatically nor can it be considered random, 
since there are many areas of grammar that have remained largely unaffected in the 
described language contact situations (see, for example, Rießler 2007 for Kildin Saami 
and Russian). An illustrative example is represented by the Finnish variation between 
nominative, accusative and partitive in subject and object coding, which has - in spite of 
occasional translators' and semi-speakers' errors (Magga 1987: 131; Länsman 2009: 78-79) 
- not gained a significant foothold in any of the three Saami languages spoken in Finland. 
Moreover, the lack of morphological cases for coding core arguments (characteristic of 
Scandinavian languages) is not found in any Saami language. 

As shown above, contact with the surrounding majority languages provides a rather 
good explanation for the variation attested in experiencer verbs in the Saami languages, 
but it is important to distinguish the results of recent language contacts and interference 
from changes that are due to language contact that has become an established part of 
the grammar of the modern languages. Although the data presented above may give the 
impression that the DOM examined here is a recent phenomenon, it has existed in at 
least North and Aanaar Saami for more than a century (see §3), and thus the variation 
cannot be seen as random, but rather an entrenched feature of the languages. Somewhat 
paradoxically, this also underlines the fact that even a seemingly superfluous DOM can 
be a somewhat stable phenomenon that in itself can be resistant to language change. In 
other words, this observation is interesting in light of the fact that DOM can be viewed as 
disturbing the consistency in object coding, but Saami data shows that it can nevertheless 
be retained through generations. 


4.3 Emptiness of semantic cases 


As the data discussed in §3 shows, the variation in the O coding (referring to the stimulus) 
concerns a variety of semantic cases (in addition to the accusative also employed for 
this function). Semantic cases, as the label implies, differ from syntactic or structural 
cases (such as the nominative and the accusative) in that they are more directly related 
to a certain semantic function. Across languages, a variety of semantic cases, such as 
the dative and different local cases, are used for marking the arguments of experiencer 
constructions. From the nature of semantic cases it follows that variation between them 
usually has semantic consequences as well; for example, replacing the allative (‘to’) with 
the ablative (‘from’) typically results in a change in the direction of the denoted instance 


A and O are here understood in the spirit of Comrie (1978) and Dixon (1979). 

of motion (see also Västi 2012 for a somewhat different discussion of the allative and 
ablative in Finnish). However, as the discussion in this paper has shown, Saami languages 
provide us with numerous examples of rather free variation between semantic cases. For 
example, in the North Saami and Aanaar Saami examples in (11c) and (15), replacing the 
illative (core meaning ‘to’) with the locative (‘at; from’) does not yield any major semantic 
differences in the reading of the clauses. This means that semantic cases are deprived 
of their semantic content when they appear with experiencer verbs. These differences 
reflect the cross-linguistic and also cross-verbal variation in the coding of experiencer 
verbs rather well, but the variation is manifested within one language and one verb. 

One of the main reasons for the loss of semantic content is that with experiencer 
verbs semantic cases are used for coding arguments that are parts of the verb's valency. 
In these cases, the arguments are accorded a semantic role directly by the verb, which 
has the consequence that the exact mechanism used for argument coding becomes less 
relevant, which renders the attested variation understandable. In the Saami languages, 
this has led to the loss of semantic contrast between certain semantic cases when they 
are used for coding objects (and stimuli) of experiencer verbs. The semantic differences 
are, however, relevant in other contexts, especially when the given cases are used for 
coding adverbials. From a synchronic point of view, a given language may select the 
case its contact language employs for coding experiencer verbs without this having any 
consequences for the reading of the construction. On the other hand, the choice of ac- 
ceptable cases is determined - or at least allowed - by the argument structure of an 
individual verb, after all. For example, even though the North Saami verb illudit 'rejoice; 
celebrate' (mentioned at the end of §3) can also code the stimulus using the comitative, 
such an alternative seems entirely impossible with liikot ‘like’. 

The discussion above also underlines the fact that DOM seems to emerge only if the at- 
tested changes do not have any major consequences for the semantic role assignment of 
the affected argument. In typical cases, the variation is between two structural cases that 
are inherently void of semantic content, but the data from the Saami languages shows 
that similar variation is possible also with semantic cases. As the semantic differences 
between the cases have been neutralized, however, the variation has no semantic conse- 
quences. The important feature of experiencer verbs seems to be their differences from 
the basic transitive construction, i.e. the events (or rather states) denoted by experiencer 
verbs rank lower for transitivity, which makes it possible for other cases than the (de- 
fault) accusative cases to be used for their coding. In other words, the exact mechanism 
or case form used for argument coding appears to be less relevant due to decreased tran- 
sitivity, which gives rise to DOM for experiencer verbs in Saami languages. Moreover, 
typical features of transitivity, such as agency and affectedness, are rather irrelevant 
to experiencer verbs in that the stimuli are usually not affected at all and even though 
agency does play a role in cases such as ‘see’ vs. ‘look’, experiencing is always less agen- 
tive and affective than typical transitive actions. This has the consequence that changes 
in these features cannot account for the attested differences in case marking. These fea- 
tures also make experiencer verbs easy targets for semantically rather void DOM. 

Above, the reasons for the rather free variation between semantic cases in the function 
of coding the O were discussed. However, this might not be the whole story, as 
there are also cases where the variation is not completely semantically free, but it may 
have resulted in a slight change in meaning. When asked about a possible semantic dif- 
ference between accusative and illative in cases such as (11a) and (11b), speakers of North 
Saami may suggest that the accusative is used for concrete liking of fish (e.g., when eating), 
while the illative is supposed to refer to a habit of liking fish in general. In other 
words, the difference between accusative and illative may at least to some extent have 
a semantic basis, and it is also related to semantic transitivity (habituals rank lower for 
transitivity, see, e.g., Gerstner-Link 1998), and, as was noted above, definiteness might 
also play a potential role here, although it is best viewed only as a by-product of the 
attested variation whose ultimate origins seem to lie in language contacts. It must, how- 
ever, be noted that authentic text materials do not obviously support the elicited judg- 
ments on possible semantic differences, and more research is thus needed on this issue. 
In this context, it is also relevant to note that some verbs that describe more intense 
feelings, such as ‘love’ and ‘hate’ (e.g., South Saami iehtsedh, Lule Saami iehttset, North 
Saami ráhkistit, Aanaar Saami ráhistid (cf. (14)), Skolt Saami rá Ksted and Kildin Saami 
Soabpse, all meaning ‘love’), only govern the accusative (and nominative in South Saami), 
which may lend further support to the higher transitivity associated with the accusative 
in (11b). 


4.4 Coherence in marking 


The variation between accusative and semantic cases can also be approached from an- 
other perspective. As suggested above, the variation may be related to a slight semantic 
change in certain cases, but examples like (20) below suggest another reason for this: 


(20) North Saami (Uralic; SIKOR) 
Nuorran  diggejin Beatles  joavkku. 
young.ESS dig.PST.1SG Beatles group.GENACC 
‘When I was young, I dug the Beatles.’ 


The North Saami verb digget ‘dig’ is a new internationalism whose O argument bears 
accusative coding. In (20), the (genitive-)accusative coding does not necessarily reflect 
a higher degree of transitivity of "digging" (in comparison to liking, for example). This 
can be explained in two ways. First, the occurrence of the accusative can be explained 
by the fact that new loan verbs govern the most common case for O coding, namely the 
accusative, which is used in typical transitive clauses, and, as has been shown, also ap- 
pears with certain experiencer verbs. Second, this may be interference from Norwegian, 
or even English, the ultimate source language for the loan. This, as opposed to the cases 


Even though native speaker students of North Saami are often aware of the prescriptive grammarians' view 
of the “impurity” of accusative objects with liikot ‘like’, Saami-Norwegian bilinguals at the Sami University 
of Applied Sciences (Guovdageaidnu) and UiT The Arctic University of Norway (Tromsø) have often, when 
asked, suggested this kind of semantic nuance between the use of accusative and illative. 

discussed above, can be taken as a tendency towards coherence in marking; functionally 
superfluous variation is usually avoided in favor of a more coherent marking system. 
This argument is in line with, for example, Barðdal's (see, e.g., Barðdal 2008; 2009) findings 
on Icelandic and other Germanic languages: in many Germanic languages, the less 
frequent argument marking patterns have disappeared, since the default nominative- 
accusative has replaced them. 

On the other hand, while the accusative coding of digget (20) can nevertheless be 
also regarded as inheritance of the transitive originals such as Norwegian digge and ul- 
timately English dig, the accusative objects of the South, Lule and North Saami verbs 
for 'care' seen in Table 3 seem to be best explained by a language-internal pursuit of 
coherence in marking - even when neither the etymological background of the verbs 
nor the predominant patterns of the majority languages seem to promote the use of the 
accusative. It is notable that the accusative coding of caring verbs coincides with the 
westernmost Saami languages, in which the accusative coding is at least one of the alter- 
natives for the liking verbs as well (Table 2). In other words, while the accusative objects 
of Lule Saami lijkkut (10) and North Saami liikot (11) can be explained as foreign influ- 
ence, Norwegian like in turn may be interpreted as the subsequent model for extending 
the accusative coding to caring verbs as well. 

It is also notable that for some verbs, the multiple outside pressures on minority lan- 
guages may pull the Saami languages in a new but single direction: A case in point 
are verbs for ‘fear’, which, as illustrated in (7) for South Saami, traditionally govern the 
elative/locative case in the Saami languages. However, it appears that not only the transitive 
pattern of Norwegian frykte and Swedish frukta, both meaning ‘fear’, but also the 
partitive coding of Finnish pelätä ‘fear’ have given the impetus for the emergence of accusative 
coding in the Saami languages as well (cf. Vuolab-Lohi 2007: 425; Olthuis 2009: 
86-87): 


(21) North Saami (Uralic; personal knowledge) 


Sii ballet  guliin ~ guliid! 
3PL fear.3PL fish.PL.LOC ~ fish.PL.GENACC 
‘They are afraid of fish!’ 


(22) Norwegian (Germanic; personal knowledge) 
De frykter fisk! 
3PL fear.3PL fish 
‘They are afraid of fish!’ 


(23) Finnish (Uralic; personal knowledge) 
He pelkäävät kaloja! 
3PL fear.3PL fish.PL.PTV 
‘They are afraid of fish!’ 


Again, the accusative coding for verbs of fearing may be seen as strengthening the 
tendency towards coherence in marking. As noted above, the coding of the verbs for 




Seppo Kittilä & Jussi Ylikoski 


‘fear’ differs from the contact languages in that in Finnish the verb does not govern the 
accusative, but rather the partitive (23), which is common for many experiencer verbs in 
Finnish. However, this has resulted in the accusative coding in North Saami, because the 
language lacks a partitive, and the Finnish partitive can also be seen as a grammatically 
determined structural case. 


4.5 Theoretical implications 


In the preceding sections, we have briefly discussed the motivation for the occurrence 
of DOM in Saami languages. We have suggested that the variation in O coding follows 
primarily from three different factors, namely language contact, emptiness of semantic 
cases and tendency towards coherence. In addition, transitivity may play a role in cases 
such as (11a), where the accusative (instead of the illative) coding may underline the con- 
creteness of the denoted event, which makes the event in question more dynamic and 
thus more transitive (see, e.g., Givón 1995: 76). In other words, the occurrence of DOM 
constitutes a rather canonical instance of competing motivations. On one hand, contact 
with different languages and the semantic emptiness of the cases used for coding expe- 
riencer constructions produces variation in the marking, while on the other hand, the 
dominance of accusative coding especially with new loan verbs may create coherence in 
marking. Experiencer verbs lend themselves naturally to this kind of variation, because 
their lower degree of transitivity favors the use of semantic cases for their coding. It 
is easy for a language to adopt the coding pattern of a surrounding majority language 
in this kind of case, and in many of the discussed instances, the coding pattern of the 
majority language is mirrored in the given Saami language. The future will show which 
of the motivations will be stronger. 

Another question related to the data discussed in this paper concerns the emergence 
of DOM. Recently, Iemmolo (2011) has argued that the occurrence (and emergence) of 
DOM is best explained by topicality. In other words, topical objects gradually start re- 
ceiving explicit (non-zero) marking, which eventually results in a fully grammaticalized 
DOM system. What is interesting from a cross-linguistic perspective is that animacy and 
definiteness, typically seen as the hallmark features of DOM, are not in any direct way 
related to the cases discussed in this paper (see also Iemmolo 2011 for a recent discussion 
based on topicality); the possible effects of definiteness are only indirect. This means that 
DOM cannot be exhaustively explained by animacy and definiteness (or topicality), but 
the data from Saami languages provides another kind of view to the development of 
DOM instead. First of all, the type of DOM examined here appears to be most common 
within a certain verb class only, namely experiencer verbs. This means that semantics 
makes an important contribution to its occurrence. As noted numerous times in the paper, 


The Saami accusative is also historically directly connected to the Finnish partitive, as the Saami plural 
accusative ending is cognate to the Finnic plural partitive, and both North Saami guliid [fish.PL.GENACC] 
(21) and Finnish kaloja [fish.PL.PTV] (23) thus go back to a common proto-form *kala-j-ta [fish.PL.PTV] 
(Sammallahti 1998: 68, 203-206). This is possibly further reflected in the fact that Saami-Finnish bilinguals 
and Finnish learners of Saami languages often tend to equate the Saami genitive-accusative with the Finnish 
partitive (Magga 2002: 131; Länsman 2009: 78-79). 

the coding of experiencer verbs varies both within and across languages. This might 
be the main reason for the fact that they are so prone to external influences. In principle, 
the language has no reason to resist the emerging variation, because it is not connected 
to any major semantic differences. For example, the differences between accusative and 
illative are not related to any semantic differences in the case of experiencer verbs, be- 
cause, as noted also above, the affectedness of stimuli is not a relevant feature with them. 
This is in line with more common manifestations of DOM, where the main consequences 
of DOM are pragmatic in nature, i.e. they do not affect the semantic roles of arguments. 

The data from the Saami languages does not provide us with a clear answer to the 
question of how and why DOM emerges in more general terms, but it aids us in under- 
standing the circumstances under which it may arise. Favorable conditions are present 
if the variation is between two structural (such as nominative and accusative) or two 
semantic cases (such as illative and locative), and the variation is thus not related to 
any major semantic differences. The differential coding of topical objects also lacks an 
obvious semantic motivation (see Iemmolo 2011), but with time, the seemingly arbitrary 
variation in object coding acquires pragmatic functions. On the other hand, animacy ef- 
fects on the coding of goals, for example, are more dramatic in nature, because we are 
also dealing with differences in roles of the affected arguments (see Kittilä 2008 for a 
more detailed discussion). It remains to be seen whether the kind of DOM attested in 
Saami languages will become more functionally triggered in the future. In any case, it 
is clear that at this point, the DOM in the Saami languages is predicate-triggered and 
only time will tell whether it will extend to objects in more general terms, and whether 
it will give rise to more evident semantic differences between the alternatives that are 
now best seen as free variation. 

Another thing that the data discussed in this paper may shed more light on is the se- 
mantic nature of cases used for coding arguments that belong to the valence of a given 
verb. The typical structural cases, most notably nominative, absolutive, accusative and 
ergative, are semantically rather void of any specific meaning and usually get their se- 
mantic role from the verb. Their use is more directly related to distinguishing between 
A and O. The DOM discussed in this paper provides us with a somewhat different kind 
of evidence for the semantic emptiness of these cases, because cases that are prototypi- 
cally best regarded as semantic behave as structural cases instead. In other words, in the 
data discussed in this paper, the employed case forms receive their meaning from the 
verb instead of having independent semantics of their own, even though we are dealing 
with semantic cases. The object slot is inherently related to a certain kind of semantic 
role, and the formal requirements outrank the inherent semantics of the employed case 
forms. 

Abbreviations 
1       first person          IMP   imperative 
2       second person         INF   infinitive 
3       third person          LOC   locative 
ACC     accusative            NEG   negation 
CNG     connegative           NOM   nominative 
COMP    complementizer        PL    plural 
DPT     discourse particle    PRS   present 
DU      dual                  PST   past 
ELA     elative               PTCP  participle 
ESS     essive                PTV   partitive 
F       feminine              Q     question marker 
GEN     genitive              REL   relative 
GENACC  genitive-accusative   SG    singular 
ILL     illative 

Chapter 17 


The emergence of differential case 
marking 


Sander Lestrade 
Centre for Language Studies, Radboud University 


This paper shows that grammatical argument marking need not be inherent to language 
but can result from language use. For this, a computer model is used that simulates the 
emergence of differential case marking in artificial protolanguages in which only lexical 
expressions and very general communicative principles are used. Agents check the expected 
success of their utterances and initially add lexical ad hoc markers to make the distributions 
of roles clear if deemed necessary. Such role markers need not be very specific, as they 
only have to distinguish between maximally two, often very different, predicate roles. Over 
time, as popular marking solutions become less costly to produce and irrelevant meaning 
dimensions are removed from their lexical representations, case markers may develop. It is 
also shown how this development can be impaired if alternative strategies, such as Agent 
First, are used. 


1 Introduction 


The goal of this paper is to show that grammatical argument marking need not be inher- 
ent to language but can result from language use. Instead of taking a more traditional 
approach for this, i.e. by tracing strategies from modern languages back to their histor- 
ical roots, a computer model is used that simulates the emergence of differential case 
marking in artificial protolanguages, languages without a formal grammar. 

In the next section, it is explained how the roles of event participants can be communicated 
in protolanguage. In §3 and §4, the way in which event communication and 
language change are modeled will be explained. §5 shows the results of the simulation, 
which will be discussed in §6. 


2 Event communication in protolanguage 


It seems reasonable to assume that language began as a set of referential expressions 
only, without any rules of grammar. In this first phase, called protolanguage by Bickerton 


Sander Lestrade. The emergence of differential case marking. In Ilja A. Seržant & 
Alena Witzlack-Makarevich (eds.), Diachrony of differential argument marking, 437-461. 
Berlin: Language Science Press. 

(1981), speakers had to use more general communicative principles to communicate who 
did what to whom. On the basis of present-day language variation, we can hypothesize 
at least four principles for this: Typing, Grouping, AgentFirst, and CheckSuccess. These 
protoprinciples are nothing but simple marking strategies and interpretation heuristics 
that speakers can be assumed to use in the absence of standard rules of grammar. As will 
be shown in §5, they do in fact suffice for successful event communication. 


2.1 Typing 


Typing is the preference of predicate or semantic roles for specific performers (Aristar 
1996; 1997). For example, the predicate READ asks for an external argument that is sentient 
and an internal argument that carries readable information. If an argument sufficiently 
suits the predicate role that it is assigned, nothing needs to be done to ensure the correct 
interpretation beyond uttering the forms referring to the concepts. If there is a mismatch, 
however, the argument needs to be forced into its role in order for the hearer to under- 
stand the utterance correctly (cf. Aristar 1996; 1997 for the original account pertaining 
to semantic case and Lestrade 2010; 2013 for a generalization). The effect of Typing in 
the argument domain can be seen in the case-marking systems of modern languages in 
which, for example, only unexpected (subject) role performers are marked, such as yaga, 
‘pig’ (Donohue & Donohue 1997). 


(1) Fore (Nuclear Trans New Guinea; Scott 1978: 115-116) 


a. Yaga:-wama wá   aegúye. 
   pig-ERG    man  3SG.OBJ.hit.3SG.SU.IND 
   ‘The pig attacks the man.’ 
b. Yaga: wá   aegúye. 
   pig   man  3SG.OBJ.hit.3SG.SU.IND 
   ‘The man kills the pig.’ 
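
As a rough illustration of Typing, the following Python sketch treats a predicate role as a profile of preferred features and marks an argument whenever it fits its own role poorly, regardless of what the other argument looks like. It is not part of the model described in this paper: the feature dimensions (including the extra dimension "human"), the numerical values and the threshold are all assumptions made here for exposition.

```python
# Illustrative sketch of Typing. All names and numbers are hypothetical.

# Preferred feature values per predicate role (assumed dimensions).
ROLE_PROFILES = {
    ("READ", "external"): {"sentient": 1.0, "human": 1.0},
    ("READ", "internal"): {"readable": 1.0},
}

# Feature values of some referents (assumed values in [0, 1]).
ENTITIES = {
    "man": {"sentient": 1.0, "human": 1.0, "readable": 0.0},
    "goat": {"sentient": 1.0, "human": 0.0, "readable": 0.0},
    "book": {"sentient": 0.0, "human": 0.0, "readable": 1.0},
}


def fit(predicate, role, entity):
    """Return how well an entity suits a role: 1.0 is perfect, 0.0 is worst."""
    profile = ROLE_PROFILES[(predicate, role)]
    features = ENTITIES[entity]
    distance = sum(abs(value - features.get(dim, 0.0))
                   for dim, value in profile.items())
    return 1.0 - distance / len(profile)


def needs_marking_by_typing(predicate, role, entity, threshold=0.8):
    """Typing: mark an argument whenever its own fit falls below a threshold."""
    return fit(predicate, role, entity) < threshold


print(needs_marking_by_typing("READ", "external", "man"))   # False
print(needs_marking_by_typing("READ", "external", "goat"))  # True
print(needs_marking_by_typing("READ", "internal", "book"))  # False
```

Under these assumed values, only the goat as a reader would receive a role marker, which corresponds to marking the unexpected role performer only.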


2.2 Grouping 


The addition of explicit marking of roles by the speaker of course requires the hearer to 
combine these markers with the referential expressions whose roles they should make 
explicit. This involves a simple Grouping principle that says to “interpret together” what 
stands together (Givón 1995; Jackendoff 2002). Indeed, if one is asked to make sense of 
the string of words car green stand in front of house yellow, most probably one will say 
that the car is green and the house is yellow, not the other way around. The principle 
can be also observed in case-marking systems where case concord only takes place if 
a modifier is separated from its head, as in Warlpiri. If the modifier is adjacent to its 
head, as in (2a), their grouping follows automatically and need not be marked; if they 
are separated, as in (2b), the ergative case suffix is duplicated. 

(2) Warlpiri (Pama-Nyungan; Hale 1973, cited in Blake 2001: 96) 

a. Tyarntu wiri-ngki=tyu      yarlki-rnu. 
   dog     big-ERG=1.SG.OBJ   bite-PST 
   ‘The big dog bit me.’ 

b. Tyarntu-ngku=tyu   yarlki-rnu  wiri-ngki. 
   dog-ERG=1.SG.OBJ   bite-PST    big-ERG 
   ‘The big dog bit me.’ 
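
The Grouping heuristic can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: the token categories ("noun", "modifier", "other") and the tie-breaking rule (prefer the preceding noun) are introduced here and are not taken from the model itself. Each modifier or marker is simply attached to the referential expression that stands closest to it.

```python
# Illustrative sketch of Grouping: "interpret together what stands together".
# Token categories and the tie-breaking rule are assumptions.

def group(tokens):
    """Attach every modifier to the nearest noun.

    tokens: list of (word, kind) pairs, kind in {"noun", "modifier", "other"}.
    Returns a dict mapping each noun to the list of its modifiers.
    """
    noun_positions = [i for i, (_, kind) in enumerate(tokens) if kind == "noun"]
    groups = {tokens[i][0]: [] for i in noun_positions}
    if not noun_positions:
        return groups
    for i, (word, kind) in enumerate(tokens):
        if kind != "modifier":
            continue
        # Nearest noun by distance; on a tie, prefer the preceding noun.
        nearest = min(noun_positions, key=lambda p: (abs(p - i), p > i))
        groups[tokens[nearest][0]].append(word)
    return groups


utterance = [
    ("car", "noun"), ("green", "modifier"), ("stand", "other"),
    ("in-front-of", "other"), ("house", "noun"), ("yellow", "modifier"),
]
print(group(utterance))  # {'car': ['green'], 'house': ['yellow']}
```

The same adjacency logic would apply to role markers: a hearer assigns a marker to the argument it stands next to.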


2.3 AgentFirst 


Typologists have firmly established the cross-linguistic word-order preference “subject 
precedes object (S<O)”. But Dryer (2013), for example, says that in his study subject and 
object are used “in a rather informal semantic sense, to denote the more agent-like and 
more patient-like elements respectively”. More generally, Siewierska (1988: 8) notes that 
when determining basic word order, linguists often only consider “stylistically neutral, 
independent, indicative clauses with full noun phrase participants, where the subject 
is definite, agentive, and human, the object is a definite semantic patient, and the verb 
represents an action, not a state or an event”. This means that word-order generalizations 
that say that S<O (often) are really about semantic roles (rather than grammatical 
roles) and claim that the more agentive participant should precede the other one. The 
AgentFirst principle is sometimes explained through iconicity: an agent prototypically 
instigates an event that affects a patient and hence the agent part of the event precedes 
that of the patient in time (DeLancey 1981; Croft 1991: 185; Anderson 2006). Whatever 
the explanation, even if grammatical notions such as subjects and objects cannot be 
used, people can assume that John hit Pete if someone said John hit Pete. Indeed, this 
preference can be observed in the speech varieties of second language learners (cf. Klein 
& Perdue 1997). 


2.4 CheckSuccess 


Whereas Donohue & Donohue (1997) analyze the differential use of case marking in (1) 
in terms of Typing, Scott (1978) himself proposes a global account. In this analysis, what 
matters is whether the argument qualifies significantly better for its role than the other 
argument that could be assigned to it in its stead. This second type of reasoning is sub- 
sumed under the CheckSuccess principle, which checks whether an utterance is likely 
to be interpreted correctly taking into account all possible cues, including the effect of 
other marking principles. Unlike these other principles, which may or may not be active 
in a speech community, the CheckSuccess principle is understood to be universal, as 
the goal of communication is to be understood. In many (theoretical and computational) 
models, this involves speakers pretending to be hearers to check whether they themselves 
would get the right meaning (cf. Levelt 1983; Hurford 1989; Zeevat 2000; Steels 
2003; Blutner et al. 2006; de Swart 2011). Note that it is hereby assumed that in principle 
speakers preferably use as little effort as possible and only elaborate if checking shows 
that a short expression does not suffice (Grice 1975; cf. Lestrade et al. 2016 for a more 
comprehensive discussion). 

As with Typing, if deemed necessary, the role distribution is made clear through additional 
marking. But note that CheckSuccess is more self-restrained: Typing uses additional 
role marking even if this turns out not to be strictly necessary for communicative 
success. For example, although goats may not form the best readership, they are still 
more qualified for it than books, which are much more likely to be read. If a goat is 
said to read a book, it should be marked according to Typing, but not for CheckSuccess. 
Whereas much could be said for a Typing scenario (thus reducing the role of economy in 
grammar and increasing the desire to prevent miscommunication), it is also interesting 
to see what happens if other protostrategies preempt the use of additional marking. 
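
The contrast between Typing and CheckSuccess can be made concrete with a small Python sketch in which the speaker acts as her own hearer. Everything in it is a hypothetical simplification: the feature profiles, the margin, and the way word order is consulted as a tie-breaker are assumptions, not the implementation used in the simulation. The speaker compares the intended role distribution with the reversed one and considers the bare utterance successful only if the intended reading is clearly better, or if AgentFirst settles a tie.

```python
# Illustrative sketch of CheckSuccess. All names and numbers are hypothetical.

ROLE_PROFILES = {
    ("READ", "external"): {"sentient": 1.0, "human": 1.0},
    ("READ", "internal"): {"readable": 1.0},
    ("HIT", "external"): {"sentient": 1.0},
    ("HIT", "internal"): {},  # assumed: anything can be hit
}

ENTITIES = {
    "goat": {"sentient": 1.0, "human": 0.0, "readable": 0.0},
    "book": {"sentient": 0.0, "human": 0.0, "readable": 1.0},
    "John": {"sentient": 1.0, "human": 1.0, "readable": 0.0},
    "Pete": {"sentient": 1.0, "human": 1.0, "readable": 0.0},
}


def fit(predicate, role, entity):
    """Return how well an entity suits a role (1.0 = perfect fit)."""
    profile = ROLE_PROFILES[(predicate, role)]
    if not profile:
        return 1.0  # a role without preferences fits everything
    features = ENTITIES[entity]
    distance = sum(abs(value - features.get(dim, 0.0))
                   for dim, value in profile.items())
    return 1.0 - distance / len(profile)


def check_success(predicate, external, internal, agent_first=False, margin=0.2):
    """Would a hearer recover the intended roles without extra marking?

    Assumes the utterance mentions the intended external argument first,
    so that AgentFirst (if active in the community) can break a tie.
    """
    intended = fit(predicate, "external", external) + fit(predicate, "internal", internal)
    swapped = fit(predicate, "external", internal) + fit(predicate, "internal", external)
    difference = intended - swapped
    if difference > margin:
        return True   # typing alone disambiguates
    if difference < -margin:
        return False  # typing points to the wrong reading
    return agent_first  # tie: only word order can help


print(check_success("READ", "goat", "book"))                   # True
print(check_success("HIT", "John", "Pete"))                    # False
print(check_success("HIT", "John", "Pete", agent_first=True))  # True
```

With these assumed values, a goat reading a book needs no marker under CheckSuccess, although it fits the reader role poorly, whereas two equally suitable participants require either a marker or an ordering convention.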


2.5 Other general principles 


When formulating and interpreting an utterance in a protolanguage, the four principles 
just mentioned can be used. For example, if speakers think the role distribution does not 
follow from Typing, they can use additional words specifying the verb-specific role of the 
arguments. The hearer in turn will use these words by Grouping to assign arguments to 
their roles. In addition to these heuristics for role disambiguation, several other general 
cognitive and communicative principles are assumed to play a role. Differently from the 
principles discussed above, they do not contribute to marking the argument structure 
directly. Instead, they influence the development of strategies with this purpose. These 
other principles are mentioned only briefly here; a more elaborate discussion follows in 
§3 and §4. 

In the selection of words from the mental lexicon, it is assumed that both frequency 
and semantic weight play a role (cf. §3.2 for the implementation of semantic 
weight). In real language, the activation threshold is lower for frequent items (Balota & 
Chumbley 1985); in the model, frequent and semantically "light" items take precedence 
when the lexicon is searched for an expression. 
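
A minimal sketch of such a lexical search is given below. How frequency and semantic weight are actually combined is not specified in this section; the sketch assumes, purely for illustration, that frequency is compared first and that lighter items win among equally frequent ones. The class name, field names and example entries are hypothetical.

```python
# Illustrative sketch of lexical selection. The ranking rule is an assumption.
from dataclasses import dataclass


@dataclass
class Entry:
    form: str
    frequency: int  # how often the item has been used so far
    weight: float   # semantic weight; lower means semantically "lighter"


def select(candidates):
    """Pick the most frequent candidate; among ties, the lightest one."""
    return max(candidates, key=lambda entry: (entry.frequency, -entry.weight))


lexicon = [
    Entry("reader", frequency=5, weight=0.9),
    Entry("doer", frequency=5, weight=0.2),
    Entry("peruser", frequency=1, weight=0.1),
]
print(select(lexicon).form)  # doer
```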

When utterances are actually produced, frequently used and predictable items are 
pronounced sloppily (Jurafsky et al. 2001). As hearers change their form representation 
on the basis of what they hear, forms may subsequently erode over time (Nettle 1999). 
When words become too short to stand on their own, they are suffixed to the preceding 
word (or prefixed to the following one, an option that is not explored in the model). 
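
The two steps just described, erosion and subsequent suffixation, can be sketched as follows. The erosion rate (one final segment per ten uses), the minimal length of a free word, and the example forms are assumptions chosen for illustration; in particular, erosion is made deterministic here, whereas sloppy pronunciation in actual usage is variable.

```python
# Illustrative sketch of erosion and suffixation. Rates and forms are hypothetical.

def erode(form, uses, every=10):
    """Drop one final segment per `every` uses, always keeping one segment."""
    drop = min(uses // every, max(len(form) - 1, 0))
    return form[:len(form) - drop]


def realize(words, min_free_length=2):
    """Suffix any word that is too short to stand alone to the preceding word."""
    output = []
    for word in words:
        if output and len(word) < min_free_length:
            output[-1] = output[-1] + "-" + word
        else:
            output.append(word)
    return output


marker = erode("toka", uses=100)  # a heavily used ad hoc role marker
print(marker)                                      # t
print(realize(["goat", marker, "book", "read"]))   # ['goat-t', 'book', 'read']
```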

People also seem to keep track of the actual usage of words and may change the mean- 
ing representations accordingly (Bybee 2010). If a word is found in a large variety of 
contexts, the dimensions along which these contexts differ most are removed from the 
lexical representation of this word. 
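
This kind of desemanticization can be illustrated with a short Python sketch. It assumes, for exposition only, that meanings and contexts are represented as dictionaries of feature values and that a dimension is removed once the variance of its observed values exceeds a fixed threshold; the text above only states that the most variable dimensions are removed, so the threshold and the feature names are hypothetical.

```python
# Illustrative sketch of meaning change through usage. Threshold is an assumption.
from statistics import pvariance


def bleach(meaning, contexts, max_variance=0.1):
    """Remove meaning dimensions along which the usage contexts differ strongly."""
    kept = {}
    for dim, value in meaning.items():
        observed = [context[dim] for context in contexts if dim in context]
        if len(observed) > 1 and pvariance(observed) > max_variance:
            continue  # too variable across contexts: drop this dimension
        kept[dim] = value
    return kept


meaning = {"sentient": 1.0, "agentive": 1.0}
contexts = [
    {"sentient": 1.0, "agentive": 1.0},
    {"sentient": 0.0, "agentive": 1.0},
    {"sentient": 1.0, "agentive": 0.9},
    {"sentient": 0.0, "agentive": 1.0},
]
print(bleach(meaning, contexts))  # {'agentive': 1.0}
```

In this toy example, a marker that is used with both sentient and non-sentient referents loses its sentience specification but keeps its agentive meaning, which is the kind of generalization a developing case marker would need.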

Finally, in many languages, given information is communicated before new information. 
This is arguably done to provide some sort of mental anchor for smooth processing. 


This possibility was suggested to me by Fred Weerman. 

Whereas such information-structuring preferences have grammaticalized completely in 
a language like Hungarian (É. Kiss 2002), they can be observed as soft constraints too 
in the speech varieties of second language learners (Klein & Perdue 1997). In the model, 
TopicFirst preposes the topic, after which an anaphoric copy is put directly after the verb 
for cross-referencing, yielding the equivalents of constructions like the man, I saw him 
(following the proposal of Givón 1995). 
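
The reordering performed by TopicFirst can be sketched as a simple list operation. The function name, its parameters and the representation of a clause as a list of phrases are assumptions made for this illustration; the choice of the anaphoric form is left to the caller.

```python
# Illustrative sketch of TopicFirst. Names and data representation are hypothetical.

def topic_first(clause, verb, topic, anaphor):
    """Prepose the topic and put an anaphoric copy directly after the verb."""
    rest = [phrase for phrase in clause if phrase != topic]
    position = rest.index(verb) + 1
    return [topic] + rest[:position] + [anaphor] + rest[position:]


print(topic_first(["I", "saw", "the man"], verb="saw", topic="the man", anaphor="him"))
# ['the man', 'I', 'saw', 'him']
```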


3 Modeling communication (in protolanguage) 


In Lestrade et al. (2016) the development of differential case marking in protolanguages is 
simulated too. We show that in communication systems in which initially only very gen- 
eral principles are involved, rules that say that animate objects need to be case-marked 
can be derived from the automatization of ad hoc repair solutions for imminent com- 
municative failure. There are some limitations to our previous study, however, as sug- 
gested by the definition of differential argument marking given by Witzlack-Makarevich 
& Serzant (2018 [this volume]): 


(3) Differential Argument Marking: 
Any kind of situation where an argument of a predicate bearing the same 
generalized semantic role (or macrorole) may be coded in different ways, 
depending on factors other than the argument role itself. 


Note that the way in which an argument is coded is left unspecified here. This is 
done for good reasons, as coding can be achieved in different ways: through word order, 
indexing, and flagging (or, using more traditional terms for the latter two, head marking 
and dependent marking). Although it could be hypothesized that these strategies are 
mutually exclusive (case marking for example freeing up word order for other uses, as is 
sometimes claimed; cf. Blake 2001: 15), most languages in fact combine multiple strategies 
(Lestrade 2015b). Also, the definition does not specify the factors that drive differential 
argument marking. Again, this is appropriate. Although many studies on differential 
argument marking focus on animacy, other factors play a role too (in fact, animacy even 
seems to play a subordinate role cross-linguistically according to Sinnemäki 2014). This
means that if we want to understand the (differential) usage of individual strategies, 
we should take into account the larger argument-marking environment in which they 
partake. That is, we should consider various marking options and factors. 

In the present study, therefore, a more elaborate simulation, viz. WDWTW, will be 
used in which neither the discriminating factor (animacy) nor the solution (flagging of 
the object) of the communication problem is predefined. 


In Lestrade (2015a), it is shown how these cross markers can eventually develop into indexes (agreement
markers); in this study, they will largely be ignored.


3.1 The simulation model


WDWTW (for who does what to whom) is a cognitively motivated multi-agent model 
that simulates language use and change. The agents of WDWTW live in a very abstract
virtual world, in which their only goal is to communicate successfully. Agents consist of 
a lexicon of object and action words, a common ground of recently discussed objects, a 
set of cognitive and communicative principles as discussed in the previous section, and 
a usage history which keeps track of the contexts in which the words have been used. 
The lexicon, common ground, and usage history are agent-specific and change over time; 
the principles are constant and shared by the population. 

Agents die after 3000 utterances and procreate at the age of 2250, at which point their 
lexicon is inherited by (i.e. taught faultlessly to) their offspring, save for minor modi- 
fications to the meanings of those words that have not been used until then. Thus, in 
the present proposal, language change is not so much the result of ‘faulty’ learning, but 
rather that of processing constraints (cf. §4). The age of procreation is not meant to be
representative: The overlap between generations is kept short to speed up the simulation, 
while remaining large enough in order for new generations to learn the basic usage pat- 
terns from their parents. As the development and maintenance of a conventional lexicon 
has been successfully modeled elsewhere (e.g. Hurford 1989; Hutchins & Hazlehurst 1995; 
Steels 1997; Kirby 2000), the present simplifications seem warranted. 

The conversation procedure is given in (4). The general idea is that two agents, a 
speaker and hearer, find themselves in a situation in which multiple events are going 
on at the same time. The speaker wants to talk about one of these events, for which 
it formulates an utterance using the protoprinciples available in its speech community. 
Next, the hearer has to identify which of the events the speaker was talking about. Con- 
versations last between 10 and 30 utterances (for each of which a new situational context 
is developed on the basis of the (conversation-dependent) common ground). 


(4) Conversation Procedure
1. Select two agents and randomly create initial common ground
2. Create situational context on the basis of common ground
Speaker:
3. Develop initial proposition for target event
4. Apply protoprinciples to develop proposition further
5. Check expected communicative success and elaborate if necessary
6. Produce utterance
Hearer:
7. Analyze words in utterance
8. Group words into constituents
9. Determine argument structure
10. Identify target event
11. Update lexicon, usage history, and common ground of speech participants
on the basis of success
12. Switch speech roles and start again at Step 2 or stop.


Eventually, a user-friendly version is to be included in the CRAN archive. Meanwhile, the code is
available from the author on request. Note that virtually all assumptions can be manipulated through the use
of model parameters. The most relevant parameters for present purposes are given in the appendix.

This setup is taken from Steels (2003) and was suggested to me by Simon Kirby.


These steps (save the first and the last, which are self-explanatory) will be discussed 
in turn below. But as the lexicon serves as the basis for most procedures in (4), I will first 
explain how the mental lexicon is represented. 


3.2 The lexicon and vector comparison 


Following Wierzbicka (1996), natural-language concepts can be decomposed into meaning
primitives such as CONCRETE, HUMAN, MALE, etc. (cf. also Guiraud 1968). Somewhat
similarly, Gärdenfors (2000) argues that concepts are sets of values along different
meaning dimensions. Thus, we can think of a cat as something that is time-stable, con- 
crete, alive, four-legged, tailed, etc. Note that whereas the initial meaning dimensions in 
such characterizations are very general and bisect the world (e.g. time-stability), eventu- 
ally meaning dimensions become more and more specific in order to single out a concept 
(e.g. having a tail). 

Abstracting away from the quality of the dimensions that organize our mental lexi- 
con, the nominal lexicon of the agents is modeled as a list of randomly generated forms 
with values along several numerical meaning dimensions (their vector representation). 
Following the observation just made, the dimensions make an increasing number of dis- 
tinctions (the first five are binary, the next four make nine distinctions; for computational 
reasons, values are restricted to the 0-1 range). These dimensions may be taken to repre- 
sent whatever properties are grammatically relevant for the linguistic behavior of words 
in natural language, but the model does not commit to any such specific interpretation. 
Table 1 shows six different noun meanings that are specified for nine dimensions. 


Table 1: First entries in the noun lexicon 


D1    D2    D3    D4    D5    D6    D7    D8    D9    ID  form
1.00  0.00  1.00  1.00  0.00  0.75  0.25  1.00  1.00  1   atadoso
1.00  1.00  0.00  1.00  1.00  0.38  0.38  0.62  0.88  2   nimator
1.00  1.00  0.00  0.00  0.00  0.62  0.50  0.25  0.62  3   umimota
1.00  0.00  1.00  0.00  1.00  1.00  0.12  1.00  0.62  4   isomera
0.00  0.00  1.00  1.00  1.00  0.00  0.25  0.75  0.00  5   enolate
1.00  1.00  1.00  0.00  0.00  0.88  0.75  0.75  0.12  6   romutil


Verbs are specified similarly, as shown in Table 2, with the addition of one or two
perspectival roles, viz. the external and, in the case of a two-place predicate, internal
argument role (cf. the (Neo-)Davidsonian approach in which an event argument is thought
of as an argument itself, which needs to be characterized accordingly; Davidson 2001;
Parsons 1994). These roles are also characterized using vector representations. And here
too, one could think of each meaning dimension as one that is grammatically relevant
in natural language (±instigating, ±intentional, ±affected, etc.), although these notions
have no meaning in the model.

In natural languages, higher perspectival roles (which become the subjects of simple 
sentences) have a preference for "prominent" features (Dowty 1991; Primus 1999; Yip et 
al. 1987). For example, the animate and volitional reader and not the inanimate book is 
the external argument of to read. To model this, external role specifications are assigned 
higher values on average. If we understand high numbers as prominent features, we can 
thus say that external roles are more prominent than internal ones in the model too. 


Table 2: First entries in the verb lexicon (abbreviated). Columns D1:9 define the 
action itself, Ext1:9 characterize the external role, Int1:9 the internal one. 


D1 ... D9      Ext1 ... Int1  Int9 ...  type      ID  form
1.00 ... 0.50  1.00 ... 0.00  0.00 ...  twoPlace  1   rirunes
1.00 ... 0.50  1.00 ... 0.00  1.00 ...  twoPlace  2   amumali
1.00 ... 0.75  0.00 ... 1.00  0.62 ...  twoPlace  3   emimano
0.00 ... 0.75  0.00 ... 0.00  0.38 ...  twoPlace  4   litaril
1.00 ... 1.00  1.00 ... 0.00  0.25 ...  twoPlace  5   adasumu
0.00 ... 0.75  1.00 ... 1.00  0.12 ...  twoPlace  6   edesito


Vector representations play an extremely important role in the model, for example 
in word selection and determining typing scores. The match between two vectors is 
determined by calculating the average (absolute) difference per meaning dimension, and 
subtracting this from 1, in which dimensions that are not specified are ignored. Given 
the range of possible values (0-1), a score of 1 shows a perfect match, a 0 shows maximal 
deviation. 

For concreteness, the typing score of atadoso for the external role of rirunes is calcu- 
lated in Table 3. Note that it is thus assumed that the noun and role dimensions corre- 
spond to each other. That is, the first dimension of a predicate role concerns the same 
feature as the first dimension characterizing an argument.


Here, the external argument is understood as the “lexical subject”, i.e. the participant whose perspective on
the event is taken by the corresponding verb. In a standard declarative sentence (in English), the external 
argument corresponds to the subject. 

It may seem redundant to specify both the action and the roles of its participants. But on the one hand, 
an event involves more than just the activities of the core arguments (e.g. cooking involves heat and pans 
and is done for the purpose of eating, which does not follow from what the cook and the food themselves 
“do”). But also, it seems the very same event can be described using different perspectives, which therefore 
involve different argument roles (cf. buy vs. sell, eat vs. eat a sandwich, and sweep the table vs. sweep the 
crumbs).


Table 3: The typing score of atadoso for the external role of rirunes


representation atadoso   1  0  1  1  0  .75  .25  1     1
representation rirunes   1  1  1  0  0  0    1    .125  0
absolute difference      0  1  0  1  0  .75  .75  .875  1
mean difference          .60
typing score             .40
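
The calculation in Table 3 can be reproduced in a few lines. What follows is a minimal sketch in Python, not the original WDWTW code; the function name vector_match and the use of None for unspecified dimensions are assumptions made for illustration only.

```python
from typing import Optional, Sequence


def vector_match(a: Sequence[Optional[float]], b: Sequence[Optional[float]]) -> float:
    """Return 1 minus the mean absolute difference per meaning dimension.

    Dimensions that are unspecified (None, an illustrative convention)
    in either vector are ignored, as described in the text.
    """
    diffs = [abs(x - y) for x, y in zip(a, b) if x is not None and y is not None]
    if not diffs:
        raise ValueError("the vectors share no specified dimension")
    return 1.0 - sum(diffs) / len(diffs)


# Values taken from Table 3.
atadoso = [1, 0, 1, 1, 0, 0.75, 0.25, 1, 1]
rirunes_external = [1, 1, 1, 0, 0, 0, 1, 0.125, 0]

typing_score = vector_match(atadoso, rirunes_external)
print(round(typing_score, 2))  # 0.4
```

The summed absolute difference is 5.375 over nine dimensions, which gives the mean difference of .60 and the typing score of .40 reported in the table.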


3.3 Step 2: Create a situational context 


Events in a situation could simply be generated as a collection of randomly generated 
numbers, the subsets of which constitute the various event ingredients (i.e. the action 
and event participants) for which the speaker has to find the best words available. In- 
stead, the lexicon is used as a starting point for this, as it makes sense to assume a link 
between the meaning of words and the structure of the world. In real life, the classi- 
fication of the world into the categories that words denote follows from the logic and 
organization of the world as perceived by the speakers of some language: we coin and 
maintain words for those meanings that are cognitively and culturally relevant (cf. e.g.
Jackendoff 2012 for the same intuition). We can use this link the other way around in
the simulation, generating events by sampling meanings from the lexicon and taking 
their combinatorial possibilities into consideration. Thus, for transitive events, objects 
from the common ground are randomly selected to instigate the events. Next, verbs are 
sampled from the lexicon on the basis of the match between the properties of the ob- 
jects and the external-role specifications of the verbs. Finally, a second set of objects is 
sampled from the common ground and from the lexicon on the basis of their match with 
the internal role of the verb, in which the odds for a new object from the lexicon are 1/6. 
For intransitive events, a set of objects is sampled from either the common ground or 
the lexicon with the odds just mentioned, after which verbs are sampled on the basis of 
the (external) role match. At each step, a certain amount of noise is added, as a result of 
which “real world” entities are not always perfect instances of “mental representations”
and event participants are not always the ideal performers of their roles. 

Table 4 shows an (abbreviated) example of a situation in the model. The V columns
refer to the characteristics of the actions that are ongoing, A refers to the referential
properties of the more agent-like participants, the actors, and U refers to those of the “other”
participants, the undergoers (after Van Valin 1999). Which grammatical and semantic
roles these participants receive depends on the verb that is chosen by the speaker to
conceptualize the event, which need not be the same verb that was used to develop it (cf.
again the contrast between buy and sell from Footnote 5). The target column identifies the
event that is to be communicated.


Table 4: First six events of a situation (abbreviated). V1:V9 show the properties
of the actions, A1:A9 the referential properties of the actors, U1:U9 those of the 
undergoers, while the target column identifies the event of interest. 


V1 ... V9       A1 ... A9      U1 ... U9      target
0.00 ... 0.12   0.00 ... 0.50
0.00 ... 0.12   0.00 ... 0.25
1.00 ... 0.75   1.00 ... 1.00  1.00 ... 0.38
1.00 ... 1.00   0.00 ... 1.00  1.00 ... 0.38
1.00 ... 0.50   1.00 ... 1.00  0.00 ... 0.50
0.00 ... 0.875  1.00 ... 1.00  0.00 ... 0.25


3.4 Step 3: Develop initial proposition 
The full target event to be communicated in the situation shown in Table 4 is given in (5). 


(5) Target event
V1    V2    V3    V4    V5    V6    V7    V8     V9     A1    A2    A3
0.00  0.00  0.00  0.00  1.00  0.25  0.25  0.375  0.875  1.00  0.00  0.00
A4    A5    A6     A7    A8     A9    U1    U2    U3    U4    U5    U6     U7
1.00  0.00  0.125  0.5   0.875  1.00  0.00  0.00  0.00  1.00  1.00  0.625  0.00
U8     U9
0.875  0.25


The speaker now first selects referential expressions for the ingredients of the target 
event, i.e. the action itself and the event participants. Conceptually, it searches for those 
lexical items that suffice to identify their referents in the situational context (as the ex- 
pressions have to be sufficiently distinctive given the distractors in the other events). 
Computationally, it compares the vector representation of the referent with all meaning 
representations in its lexicon and next checks if the match between the meaning repre- 
sentation of the preferred item is sufficiently distinct from the distractor vectors in the 
situation (that is, better by at least 0.05). 

The order in which items are considered for expression is only partly determined by 
the vector match. Frequency of use and semantic weight are also taken into consideration.
The first factor prefers frequently used forms, the latter "light" meanings, which
are specified for fewer meaning distinctions.
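
The selection step can be sketched as follows. This is an illustrative Python reconstruction under stated assumptions, not the model's own code: the names match, sem_weight and select are hypothetical, unspecified dimensions are written as None, and the ranking here uses semantic weight and match only, leaving out frequency of use. The 0.05 margin is the one mentioned in the text.

```python
from typing import List, Optional, Sequence, Tuple

Vector = Tuple[Optional[float], ...]


def match(word: Sequence[Optional[float]], referent: Sequence[Optional[float]]) -> float:
    """1 minus the mean absolute difference over the specified dimensions."""
    diffs = [abs(w - r) for w, r in zip(word, referent) if w is not None and r is not None]
    return 1.0 - sum(diffs) / len(diffs)


def sem_weight(word: Sequence[Optional[float]]) -> int:
    """Semantic weight, taken here to be the number of specified dimensions."""
    return sum(value is not None for value in word)


def select(target: Vector, distractors: List[Vector], lexicon: List[Vector],
           margin: float = 0.05) -> List[Vector]:
    """Return the sufficiently distinctive lexemes, lightest first."""
    candidates = [
        word for word in lexicon
        # The word must fit the target better than every distractor by the margin.
        if all(match(word, target) - match(word, d) >= margin for d in distractors)
    ]
    # Prefer "light" meanings; break ties by the better match with the target.
    return sorted(candidates, key=lambda w: (sem_weight(w), -match(w, target)))


# Toy situation: target 0 1 1, distractors 0 1 0 and 1 1 1.
lexicon = [(0, 1, 1), (0, 0, 1), (0, None, 1)]
ranked = select((0, 1, 1), [(0, 1, 0), (1, 1, 1)], lexicon)
print(ranked[0])  # (0, None, 1)
```

In this toy situation all three lexemes are distinctive enough, and the one that leaves the second dimension unspecified is ranked first because it is the lightest.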

For the target event in (5), the initial proposition is given in (6). As shown by the 
referential match value (refMatch; other values will be discussed when relevant), neither 
of the nouns perfectly describes their referents, but, apparently, they suffice given the 
context. Note further that the order of the referential ingredients has been randomized. 


For example, if the target object is a 0 1 1 and the only distractors are a 0 1 0 and a 1 1 1, the selected expression
at least has to specify the first and the third dimension, but need not represent the second faithfully as it
is not distinctive. Thus, if the lexicon contains the lexemes 0 1 1, 0 0 1, and 0 - 1 (in which “-” means not
specified), all three could be used successfully, but the third will be preferred because of its lower semantic
weight.


(6) Initial proposition
a. Internal argument
D1  D2  D3  D4  D5  D6     D7  D8     D9     ID  form     freq
0   0   0   1   1   0.375  0   0.875  0.125  43  leludor  0
argFreq  nounFreq  verbFreq  recency  semWeight  refMatch   collFreq
0        0         0         51       1          0.9583333  0
topic  Typing
1      0.75

b. External argument
D1  D2  D3  D4  D5  D6    D7     D8    D9  ID  form     freq  argFreq
1   0   0   1   0   0.25  0.375  0.75  1   50  inideta  0     0
nounFreq  verbFreq  recency  semWeight  refMatch   collFreq  topic
0         0         51       1          0.8472222  0         0
Typing
0.625

c. Verb
D1  D2  D3  D4  D5  D6    D7    D8     D9     Ext1  Ext2  Ext3  Ext4
0   0   0   0   1   0.25  0.25  0.375  0.875  0     1     0     1
Ext5  Ext6   Ext7   Ext8   Ext9   Int1  Int2  Int3  Int4  Int5  Int6  Int7
1     0.125  0.375  0.625  0.875  0     0     0     0     0     0.25  0
Int8   Int9  type      ID   form     freq  recency  semWeight
0.875  0.25  twoPlace  126  nulotos  0     51       1
refMatch  collFreq  topic
1         0         0


3.5 Step 4: Apply protoprinciples 


Depending on the protoprinciples that are active in a speech community, various opera- 
tions are performed at this step. Here, agents from the first generation of three different 
lineages will be discussed for illustration; in §5, all other possible combinations are
discussed.

In the AF lineage, AgentFirst is used as a marking strategy, because of which a speaker
puts the actor participant in first position. The actor is understood as performing the
more prominent verb role, and as explained in §3.2, higher values stand for prominent
features. Thus, as the initial values of the external role of the verb are higher than those
of the internal one, the external argument is found to be the actor, and is therefore put
in first position. As nothing else changes, the representation is shown in abbreviated
form only in (7):


In this comparison, the first few values are deemed more important than the later ones. The dimensions of
the two role vectors are compared one by one, starting with the first, and as soon as a difference is found
between two corresponding values (the second, in the present example), the vector with the higher
value is considered the actor role.


(7) AF proposition
a. External argument 


b. Internal argument 
c. Verb 
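
The comparison by which AgentFirst identifies the actor role can be illustrated with a short sketch. The Python below is an assumption-laden illustration rather than the model's implementation: the function name actor_role is hypothetical, and returning None for two identical role vectors is a choice made here, since the text does not say how ties are handled. The role values are those of nulotos in (6c).

```python
from typing import Optional, Sequence


def actor_role(external: Sequence[float], internal: Sequence[float]) -> Optional[str]:
    """Compare two role vectors dimension by dimension, earliest first.

    The role with the higher value at the first differing dimension
    counts as the more prominent one, i.e. the actor role.
    """
    for ext_value, int_value in zip(external, internal):
        if ext_value != int_value:
            return "external" if ext_value > int_value else "internal"
    return None  # identical vectors: undecided in this sketch


# External and internal role specifications of nulotos, from (6c).
nulotos_ext = [0, 1, 0, 1, 1, 0.125, 0.375, 0.625, 0.875]
nulotos_int = [0, 0, 0, 0, 0, 0.25, 0, 0.875, 0.25]

print(actor_role(nulotos_ext, nulotos_int))  # external
```

The first dimension is identical for both roles, so the decision falls on the second one, where the external role has the higher value.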


After having placed the actor in first position, a speaker of the AFTF lineage, in which 
TopicFirst is also active, preposes the topic of the utterance. In addition, an anaphoric 
copy is added as a verbal marker for cross reference (following Givón 1995). In the initial
phases of language development, this cross marker is often the same word as the
antecedent (cf. the identical values for marker ID and marker target in (8)), but once
pronouns have been developed, more general items can be used for this. As illustrated in
the present example, TopicFirst may interfere with AgentFirst. Whenever the undergoer 
happens to be the topic, it will be put in first position in spite of AgentFirst. In the model, 
actors are five times more likely to be the topic of communication than undergoers. 


(8) AFTF proposition
a. Internal argument


b. External argument


c. Verb
intMarkerID  intMarker  intMarkerTarget  intMarkerFreq
43           leludor    43               0


The AFTFTC lineage uses all available marking principles by also including TypeCast, 
the production instantiation of Typing. The same initial procedure is followed as in the 
previous lineage (since AgentFirst and TopicFirst apply too). In addition, however, the 
speaker now considers whether event participants qualify for their roles. If the Typing 
score is below .7, the speaker searches its noun lexicon to look for the best expression to 
make this role explicit. As the Typing score of inideta in (6) shows, it falls short for its 
external role. The best expression to remedy this is found to be rurutis, which is added 
to the representation of the external argument (again, only the changes that are made 
with respect to the initial proposition are shown): 


(9) AFTFTC proposition
a. Internal argument


b. External argument
...  markerID  marker   markerFreq
...  916       rurutis  0
c. Verb
intMarkerID  intMarker  intMarkerTarget  intMarkerFreq
43           leludor    43               0


3.6 Step 5: Check success and elaborate if necessary


In Step 5, speakers determine whether their derived proposition would be understand- 
able if uttered as such. If the role distribution of the arguments is made explicit by Type- 
Cast or can be told using AgentFirst, communicative success is assumed. If these princi- 
ples do not apply or lead to the wrong result, speakers check whether the typing score 
of the arguments for their own roles are distinctively above the scores of the other argu- 
ments for these roles. If so, the hearer should be able to derive the meaning nevertheless. 
If not, a marker is added to make the role of the second argument explicit, where the 
ambiguity first arises (assuming incremental processing). 

When the speaker of the AF lineage checks whether the meaning of its derived proposition
would follow sufficiently from the selected combination of lexemes, it assumes that
its hearer will use the same AgentFirst principle in interpretation. Also in the AFTFTC
lineage no further action is needed, as bad performers were explicitly marked for their
role in Step 4 already. The speaker of the AFTF lineage, however, finds out that the hearer
cannot derive the intended meaning. Because of the preposed undergoer topic, Agent- 
First would lead to the wrong result. As the actor happens to come second, it is again 
the actor role that has to be made explicit, and the same proposition as in (9) is derived. 


3.7 Step 6: Produce the utterance 


In principle, speakers simply utter the lexemes present in the derived proposition at 
this step. However, forms are “pronounced” sloppily if they are frequently and recently 
used, or predictable in their context (Jurafsky et al. 2001). Words are predictable if they 
frequently co-occur in specific relations, such as external-argument-of (shown by the 
collostruction value collFreq in (6)). Sloppy pronunciation is operationalized by replacing 
the last vowel/consonant of a word by the preceding vowel/consonant in the alphabet, 
or removing it altogether if this is no longer possible (in the cases of a and b). As none of 
the items in the example above meet the requirements for reduction, none of the forms 
is reduced. Thus, we arrive at the utterance in (10a) for the AF speaker and at that in (10b) 
for both the AFTF and AFTFTC speakers. 
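
The reduction rule described above can be sketched in a few lines. This is a hedged Python illustration, not the original code: the function name reduce_form is hypothetical, forms are assumed to consist of lowercase ASCII letters, and treating y as a consonant is an assumption.

```python
VOWELS = "aeiou"
CONSONANTS = "bcdfghjklmnpqrstvwxyz"  # assumption: y counts as a consonant


def reduce_form(form: str) -> str:
    """Apply one step of sloppy pronunciation to a word form.

    The final letter is replaced by the preceding vowel or consonant in
    the alphabet; a final a or b, which has no predecessor, is dropped.
    """
    if not form:
        return form
    last = form[-1]
    series = VOWELS if last in VOWELS else CONSONANTS
    if last not in series:
        raise ValueError(f"unsupported character: {last!r}")
    position = series.index(last)
    if position == 0:
        return form[:-1]  # a or b: remove the letter altogether
    return form[:-1] + series[position - 1]


print(reduce_form("rurutis"))  # rurutir
print(reduce_form("inideta"))  # inidet
```

Each call performs a single reduction step; whether a form is reduced at all depends on its frequency, recency and predictability, which this sketch does not model.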


(10) a. inideta leludor nulotos
‘Inideta nulotoses leludor.’ (AF)
b. leludor  inideta  rurutis    nulotos   leludor
   leludor  inideta  nulotoser  nulotosV  leludor
‘Leludor is nulotosed by inideta.’ (AFTF/AFTFTC)


3.8 Step 7: Analyze words 


Now it is the hearer's turn. First it needs to determine which lexemes it thinks were
intended (as the word forms may differ from their representation because of sloppy
pronunciation). For each form, the agent searches both its verb and noun lexicon for the entries
that match best (in terms of the edit distance between the perceived and represented
forms). In order to determine the verb, for each word, the product of the verb match
(verbScore) and the argument matches of the remaining words (nounScore) is calculated,
resulting in a verbEvidence score for that word. In Table 5 the results are shown for the
analysis of (10b). The word that yields the best product is understood as the verb, which
is (indeed) nulotos.


Table 5: Identifying the verb in the AFTF utterance. 


form     role  vID  vMatch   vScore  nID  nMatch   nScore  verbEvidence
leludor  ?     236  leletad  0.02    43   leludor  1       0.00
inideta  ?     439  iniraru  0.04    50   inideta  1       0.00
rurutis  ?     205  runisum  0.04    916  rurutis  1       0.00
nulotos  verb  126  nulotos  1.00    690  nulenod  0.04    1.00
leludor  ?     236  leletad  0.02    43   leludor  1       0.00
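
The verbEvidence column of Table 5 can be recomputed from the other two score columns. The sketch below is an illustration in Python under the assumption that the evidence for a word is its own verb score multiplied by the noun scores of all other words in the utterance; the function name verb_evidence is hypothetical, and the scores themselves are simply copied from the table rather than derived from an edit distance.

```python
from math import prod
from typing import List, Tuple

# (form, vScore, nScore) for each word of utterance (10b), from Table 5.
rows: List[Tuple[str, float, float]] = [
    ("leludor", 0.02, 1.0),
    ("inideta", 0.04, 1.0),
    ("rurutis", 0.04, 1.0),
    ("nulotos", 1.00, 0.04),
    ("leludor", 0.02, 1.0),
]


def verb_evidence(words: List[Tuple[str, float, float]]) -> List[float]:
    """Verb score of each word times the noun scores of all remaining words."""
    evidence = []
    for i, (_, v_score, _) in enumerate(words):
        others = prod(n_score for j, (_, _, n_score) in enumerate(words) if j != i)
        evidence.append(v_score * others)
    return evidence


evidence = verb_evidence(rows)
verb_index = max(range(len(rows)), key=evidence.__getitem__)
print(rows[verb_index][0])  # nulotos
```

Rounded to two decimals, the resulting values are 0.00 for every word except nulotos, which scores 1.00, as in the table.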


3.9 Step 8: Group words 


After identifying the verb, various groupings of the remaining words are possible. Anal- 
yses in which words are not assigned a function in the reconstruction under a given 
grouping analysis (either as an argument or a marker) are discarded. Given the situa- 
tion in Table 5, there are only two possible groupings. The second word could be a noun 
marker making the role of the first word explicit, or the third word could be the role 
marker of the second word (cf. Table 6). 


Table 6: Grouping possibilities for AFTF utterance 


grouping 1              grouping 2
form     role           form     role
leludor  argument       leludor  argument
inideta  role marker    inideta  argument
rurutis  argument       rurutis  role marker
nulotos  verb           nulotos  verb
leludor  cross marker   leludor  cross marker


3.10 Step 9: Determine argument structure 


Next, for all possible groupings, the argument structure is determined. In all lineages it
holds that “morphology” overrules heuristics such as AgentFirst and TypeMatch. That
is, arguments are assigned the predicate role with which their markers match best. The
matches between the verb roles of nulotos and the two presumed markers are given in
Table 7.


Editing final characters is considered less costly than editing initial ones in this procedure.


Table 7: Match between markers and predicate roles. 


         verb role
marker   external  internal
inideta  .74       .75
rurutis  .94       .49


When analyzed as a role marker, inideta cannot properly distinguish between the two 
roles, because of which the argument structure cannot yet be determined for the first 
grouping. Rurutis, however, does clearly mark the external role of nulotos. It now follows
logically that the other argument should have the internal role. Thus, the meaning
arrived at given this second grouping is ‘Inideta nulotoses leludor’.
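
The decision based on Table 7 can be sketched as follows. Note that the text does not state how large the difference between the two matches has to be for a marker to count as distinctive; the 0.05 margin used below is borrowed from the word selection step and is purely an assumption, as is the function name role_from_marker.

```python
from typing import Optional


def role_from_marker(ext_match: float, int_match: float,
                     margin: float = 0.05) -> Optional[str]:
    """Return the role a marker signals, or None if it is not distinctive.

    The margin is an illustrative assumption, not a documented parameter.
    """
    if abs(ext_match - int_match) < margin:
        return None
    return "external" if ext_match > int_match else "internal"


# Match values from Table 7.
print(role_from_marker(0.74, 0.75))  # None (inideta does not distinguish the roles)
print(role_from_marker(0.94, 0.49))  # external (rurutis marks the external role)
```

Once one argument has been assigned the external role in this way, the remaining argument receives the internal role by elimination.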

After failing to exploit the morphology in the first grouping, hearers of the AFTF and 
AFTFTC lineage now use the AgentFirst principle to arrive at the interpretation ‘Leludor
nulotoses rurutis’. Note that they cannot use TopicFirst as an interpretation heuristic, as
this says nothing about the predicate role. Instead, the hearers assume that if the first 
argument were not the agent, a speaker would have made this explicit in Step 5. 


3.11 Step 10: Identify target event 


For each of the interpretations for the various groupings, the hearer now determines 
which of the events in the situation matches best by comparing the vector representa- 
tions of the words with those of the properties of the corresponding referents in the 
events. Each interpretation is then linked to the event it describes best overall, and the 
interpretation that results in the combination with the highest score is considered the 
correct one. Thus, the overall best match of the interpretation ‘Inideta nulotoses leludor’
is 2.81 with the 16th event in the situation (in which the verb semantics match perfectly,
the external argument has a referential match of .85, and the internal argument has a
referential match of .96). As the best event match of the interpretation ‘Leludor nulotoses
inideta’ is only 2.38, the former interpretation is preferred. This interpretation does
indeed lead to the target event the speaker was trying to point out, hence communication
is successful.


3.12 Step 11: Update numbers 


If communication is successful, i.e. if the hearer identifies the target event, both agents 
update the frequency scores in their lexicons. In this, they distinguish between overall 
and relative frequency. That is, separate scores are kept of the net use of words as referential
expressions, noun or verb markers. If, for example, a word is used as a noun marker,
both its total and noun-marker frequency go up by one point, while the argument and
verb-marker scores go down by one (with a minimum of zero). 
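
The bookkeeping just described can be sketched as follows. The Python below is illustrative only. The field names follow the columns shown in (6), but reading argFreq, nounFreq and verbFreq as the scores for referential, noun-marker and verb-marker use respectively is an interpretation, and extending the rule stated for noun markers to the other two uses is an assumption. The names Entry and record_use are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    form: str
    freq: int = 0      # overall frequency
    argFreq: int = 0   # net use as a referential expression (assumed reading)
    nounFreq: int = 0  # net use as a noun marker (assumed reading)
    verbFreq: int = 0  # net use as a verb marker (assumed reading)


FIELD_FOR_USE = {
    "argument": "argFreq",
    "noun_marker": "nounFreq",
    "verb_marker": "verbFreq",
}


def record_use(entry: Entry, use: str) -> None:
    """Update the scores of a word after a successful utterance."""
    used_field = FIELD_FOR_USE[use]
    entry.freq += 1
    for field_name in FIELD_FOR_USE.values():
        current = getattr(entry, field_name)
        if field_name == used_field:
            setattr(entry, field_name, current + 1)
        else:
            # Competing uses lose a point, but never drop below zero.
            setattr(entry, field_name, max(0, current - 1))


rurutis = Entry("rurutis", freq=3, argFreq=2)
record_use(rurutis, "noun_marker")
print(rurutis)
```

After the call, the overall frequency of the hypothetical entry is 4, its noun-marker score is 1, its argument score has dropped to 1, and its verb-marker score stays at the minimum of zero.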

In addition, agents add the words used to their usage history, together with the event 
ingredient to which they referred or the verb role they marked. Thus, for the noun inideta 
it will be remembered that it has been used for a 1.00 0.00 0.00 1.00 0.00 0.125 0.5 0.875 
1.00 (cf. (5) and (6)). Finally, if there was a newly introduced argument noun in the utter- 
ance, this is added to their mutual common ground, on the basis of which a situation is 
generated for the next turn. 


4 Modeling language change 


Although it is now explained how agents talk with each other, it still needs to be shown 
how language can change and how argument-marking strategies can develop in the 
model. For this, grammaticalization principles as proposed by Heine & Kuteva (2007) are 
implemented. As was shown in §3.2, initially all words are fully specified semantically
and have a word length of 7 characters. Over time, however, words can desemanticize 
and erode. 

Erosion results from frequent use, either in terms of absolute frequency or in specific 
combinations. As said in §2.5, forms are pronounced sloppily if they are frequently and
recently used or predictable (Jurafsky et al. 2001). Sloppy pronunciation does not lead to 
a change of lexical representation for the agent using the form. If a (younger) agent does 
not have a firmly established representation yet, however, it will adapt its representation 
on the basis of what it hears as a result of which word length may change over time. Thus, 
rurutis may become rurutir, and eventually ru. Erosion stops if a form is two characters 
long. 

If forms become too light to be used independently, they are suffixed to the preceding 
word in the utterance. In case of noun markers, this is of course the argument whose 
role they make explicit. The phonological weight of a letter is simply implemented by 
considering its rank in the alphabet, distinguishing between vowels and consonants: a 
and b cost one point, e and c two, etc. The production effort of a word is then calculated 
by adding up the ranks of its constituent letters. If the production effort of a word falls 
below 15, it becomes a suffix. 
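
The production effort measure can be made concrete with a short sketch. This Python illustration assumes lowercase ASCII forms and treats y as a consonant; the function names are hypothetical, while the threshold of 15 is the value given in the text.

```python
VOWELS = "aeiou"
CONSONANTS = "bcdfghjklmnpqrstvwxyz"  # assumption: y counts as a consonant


def letter_weight(letter: str) -> int:
    """Rank of a letter within the vowels or within the consonants (a, b = 1)."""
    series = VOWELS if letter in VOWELS else CONSONANTS
    return series.index(letter) + 1


def production_effort(word: str) -> int:
    """Sum of the phonological weights of the letters of a word."""
    return sum(letter_weight(letter) for letter in word)


def becomes_suffix(word: str, threshold: int = 15) -> bool:
    """A word is suffixed once its production effort falls below the threshold."""
    return production_effort(word) < threshold


print(production_effort("rurutis"))  # 72
print(production_effort("ru"))       # 19
print(becomes_suffix("da"))          # True
```

Under these assumptions a full seven-letter form such as rurutis is far above the threshold, and even the two-letter form ru (effort 19) would still stand on its own, whereas a form made of early letters such as da (effort 4) would be suffixed.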

In addition, a word may extend its denotation range incidentally (due to the lack of 
a better expression altogether or because a better matching expression is not necessary 
given the context). Eventually, this extension may become a standard part of a word's 
meaning, as a result of which it becomes more general. In the model, such desemanticiza- 
tion involves the progressive removal of the meaning dimensions of a word along which 
most deviation from the lexical representation is found in the usage history. Deletion 
takes place after certain frequency thresholds have been reached. For a first dimension 
to be removed, a word has to be used in 1% of utterances. This proportion increases ex- 
ponentially to 30% of utterances for the last dimension to be removed. Following Bybee 
(2010), desemanticization can occur within a single generation.

Note that frequently used words are likely to appear in a larger variety of contexts (as
variation requires variables). Thus, frequent words can be expected to be more prone to 
desemanticization. Moreover, as both high frequency and light semantics lead to higher 
activation (i.e. precedence in the evaluation of candidate expressions; cf. the discussion of 
word selection in Step 3), the grammaticalization process starts a positive feedback loop. 
More general words can be applied in a larger variety of contexts, because of which they
become even more frequent, because of which they desemanticize even further, because 
of which they are more often considered, etc. 
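
The frequency thresholds for desemanticization described above can be illustrated as follows. The text only states the end points (1% and 30%) and that the increase is exponential; the geometric interpolation, the number of removable dimensions (nine dimensions minus the minimalSpecification of two from the appendix) and the function names are assumptions made for this sketch.

```python
def removal_thresholds(first: float = 0.01, last: float = 0.30,
                       n_removable: int = 7) -> list[float]:
    """Usage proportions required to remove the 1st, 2nd, ... dimension.

    Assumption: thresholds grow geometrically from `first` to `last`.
    """
    if n_removable < 1:
        raise ValueError("at least one dimension must be removable")
    if n_removable == 1:
        return [first]
    ratio = (last / first) ** (1 / (n_removable - 1))
    return [first * ratio ** step for step in range(n_removable)]


def removable_dimensions(usage_proportion: float, thresholds: list[float]) -> int:
    """Number of dimensions whose removal threshold has been reached."""
    return sum(usage_proportion >= threshold for threshold in thresholds)


thresholds = removal_thresholds()
print([round(t, 3) for t in thresholds])
print(removable_dimensions(0.10, thresholds))
```

The point of the sketch is only that each further step of semantic bleaching requires a markedly higher usage proportion than the previous one, so that only the most frequent words can lose most of their dimensions.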

In this process, there is a crucial difference between words that are used as markers and 
those that are used as referential expressions: To refer to things in the world, it is often 
necessary to use explicit terms to distinguish the intended referent from the distractors in 
the situation. Thus, after being considered as a referential expression, many top-ranked 
words will be discarded for the very same reason they were considered first: they apply 
to too many things and thus they are not distinctive enough. But for both role and cross 
markers, there is always only one distractor: either the other predicate role or the other 
argument. This means that markers need not be very explicit, and therefore that in many 
cases, "light" expressions suffice. Thus, markers have a much easier time maintaining 
their positive feedback loop, which allows them to develop their characteristically short 
form and general meaning. 


5 Simulating the development of differential case marking 


The evolution of eight different lineages combining the different marking strategies
introduced in §3.5 is studied over 56 generations. The set of strategies used by a lineage
can be derived from its name, e.g. the lineage AFTC combines AgentFirst and TypeCast. 
CheckSuccess is always present, and all other model parameters are kept constant (cf. 
the appendix). 
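
The naming convention can be made explicit with a small decoding routine. This is an illustrative sketch only; the function names are hypothetical, and the two-letter codes are taken from the lineage names used in this section.

```python
from itertools import combinations

STRATEGY_CODES = {"AF": "AgentFirst", "TF": "TopicFirst", "TC": "TypeCast"}


def strategies(lineage: str) -> list[str]:
    """Decode a lineage name into the strategies it combines.

    CheckSuccess is always present; the lineage CS uses nothing else.
    """
    if lineage == "CS":
        return ["CheckSuccess"]
    if len(lineage) % 2 != 0:
        raise ValueError(f"cannot split {lineage!r} into two-letter codes")
    codes = [lineage[i:i + 2] for i in range(0, len(lineage), 2)]
    return [STRATEGY_CODES[code] for code in codes] + ["CheckSuccess"]


def all_lineages() -> list[str]:
    """Enumerate the eight lineages: CS plus every non-empty combination."""
    names = ["CS"]
    for size in range(1, len(STRATEGY_CODES) + 1):
        for combo in combinations(STRATEGY_CODES, size):
            names.append("".join(combo))
    return names


print(all_lineages())
print(strategies("AFTC"))
```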


5.1 Communicative success 


First consider the success rates of the different lineages over time in Figure 1. For this, a 
ninth lineage, TM, is added in which no marking strategy whatsoever is used. TM speak- 
ers simply produce the selected referential expressions and hope that their hearers can 
derive the distribution of roles using type matching. Thus, this lineage establishes a
baseline of communicative success given the “predictability of the world”. As was shown in
§2, events are created on the basis of the meaning representations of the agents, meaning
that many utterances need no additional marking: If a book and a woman are involved 
in a reading event, it is obvious who is doing what. Because of the noise that is added, 
however, things are not always this clear. The noise level and therefore the world pre- 
dictability is the same for all lineages. 

All lineages initially score well above the TM baseline of roughly 85%, and manage
to communicate events (almost) completely successfully throughout. Note that the combined
use of TopicFirst and AgentFirst in the AFTF lineage leads to a negligible decrease in
communicative success, if any. Although, as explained above, TopicFirst impairs the
functionality of AgentFirst whenever the Actor happens not to be the topic, such utterances
are "repaired" by CheckSuccess.


[Plot: success rate per lineage (y-axis) against generation (x-axis, 0 to 60)]


Figure 1: Success rates over time per lineage. Lineage names are abbreviations 
of the marking principles included (AgentFirst, TopicFirst, TypeCast). The solid 
line marks the baseline of communicative success given the predictability of 
the world as evidenced by the TM lineage; CS uses CheckSuccess only. 


5.2 Profiles of most frequent words 


The goal of the simulation was to see under which conditions differential argument mark- 
ing emerges. This section will show the profiles of the three most frequently used words 
in the different lineages after 56 generations, and contrast these with their original rep- 
resentations to see whether they developed into case markers. In natural language, case 
markers can be characterized as frequently used words with maximally short forms and 
a very general meaning that mark the relation of an argument with its head. The model 
equivalents will be recognized as such if they are used to mark the semantic/perspectival 
role of their host, are specified for a few semantic dimensions only, and have eroded to 
the extent that they have to be suffixed. 

In the tables below, semantic weight (semWeight) is the proportion of dimensions 
that is still specified out of the maximum of nine. Production effort (prodEff) shows the 
production effort of the lexemes. The frequency column (freq) shows the total successful 
use frequency; the arg, noun, and verb variants show net use as argument, noun marker, 
and verb marker respectively. 
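
As a reading aid for the semWeight column, the proportion can be computed as in the following sketch. Representing a removed dimension as None is an assumption made here for illustration; the chapter does not specify the internal data structure.

```python
N_DIMENSIONS = 9  # maximum number of meaning dimensions


def sem_weight(meaning: list) -> float:
    """Proportion of dimensions that are still specified, rounded to two decimals."""
    if len(meaning) != N_DIMENSIONS:
        raise ValueError("a meaning representation has exactly nine dimensions")
    specified = sum(value is not None for value in meaning)
    return round(specified / N_DIMENSIONS, 2)


# A word that has lost four of its nine dimensions (values are made up).
example = [1, None, 0, None, 1, 7, None, 3, None]
print(sem_weight(example))
```

A word with five remaining dimensions thus has a semantic weight of 5/9, printed as 0.56 in the tables; three remaining dimensions correspond to 0.33.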

Let us first consider the lineages in which only one marking strategy is active (beyond
CheckSuccess). Since AgentFirst is a perfectly viable argument marking strategy, no case
marking is expected to develop. Indeed, in the AF lineage words are used as referring
expressions only, as shown in Table 8.

As explained in §4, most situations require a rather specific expression to distinguish
a target object from its distractors, meaning that it is rather hard for a referential expres- 
sion to grammaticalize. Nevertheless mamesut/mames developed into what one could 
call a pronoun. It lost four meaning dimensions (its semantic weight is 5/9) and 12 pro- 
duction points (going down from 27 to 15), and it is used in more than 10% of utterances. 


Table 8: Representations of three most frequent words of the AF lineage. First 
block: first generation; second block: representations after 56 generations. 


ID   form     freq  argFreq  nounFreq  verbFreq  prodEff  semWeight
624  mamesut  3     3        0         0         27       1
890  anedume  0     0        0         0         18       1
216  dadutin  3     3        0         0         22       1

ID   form     freq  argFreq  nounFreq  verbFreq  prodEff  semWeight
624  mames    411   411      0         0         15       0.56
890  anedi    70    70       0         0         11       0.89
216  dadut    69    69       0         0         15       0.89


At the other extreme, there is the TypeCast strategy, which uses role marking even if 
not strictly necessary for communicative success. If anywhere, case marking is expected 
to develop here. Indeed, rilamos/rid was already found to be a convenient marker in the 
first generation, quickly losing four meaning dimensions (cf. Table 9). After 56 genera- 
tions, only three dimensions remain. Also, it lost 15 production points, with the result 
that it now has to be suffixed to its host. 


Table 9: Representations of three most frequent words of the TC lineage. 


ID   form     freq  argFreq  nounFreq  verbFreq  prodEff  semWeight
374  rilamos  713   1        511       0         24       0.56
342  omasusu  116   44       0         0         30       0.89
681  onodato  13    11       0         1         25       1

ID   form     freq  argFreq  nounFreq  verbFreq  prodEff  semWeight
374  rid      1517  0        1121      0         9        0.33
342  omad     388   303      0         0         9        0.56
681  ono      155   129      0         0         12       0.67


A typical example of the usage of rid is given in (11). As the remaining meaning
dimensions have a value of zero and high numbers were understood as prominent (cf.
§3.2), ri(d) is glossed as an undergoer marker.

(11) omad atilomi-ri lematim
     3A   atilomi-U  lematim.V

     ‘It lematims atilomi.’


Like mames in the previous lineage, omad and ono could be considered pronouns. The 
remaining dimensions of the former all have a value of 1, hence the gloss as Actor in (11); 
four out of five dimensions of ono are zero, which could therefore be considered the 
object pronoun. 

Note that the noun marker rid has grammaticalized much further than these referential
expressions. Recall from §4 that this is expected indeed. Whereas role markers only
have to be minimally distinctive (given only one distractor role), referential expressions 
have to distinguish between dozens of distractors. 

Since noun marking is only used in case of a typing mismatch, we can easily create a 
minimal pair in which the internal argument is better qualified for its role and hence no 
marking is necessary. The contrast between (11) and (12) shows the differential nature of 
the marking system: 


(12) omad  isosisi lematim 
3A isosisi lematim.V 


‘It lematims isosisi.’


Also in the CS lineage role markers are used, albeit less frequently for reasons
explained in §3.5. And here too, case markers eventually develop. As shown in Table 10,
the most frequently used word, unatoru/una, is mostly used as a noun marker and lost 
five meaning dimensions and 21 production points, with the result that it can only be 
used as a suffix. 


Table 10: Representations of three most frequent words of the CS lineage. 


ID   form     freq  argFreq  nounFreq  verbFreq  prodEff  semWeight
237  unatoru  102   1        44        0         31       0.89
940  donuran  5     2        1         0         24       1
69   damumil  4     4        0         0         18       1

ID   form     freq  argFreq  nounFreq  verbFreq  prodEff  semWeight
237  una      715   0        314       0         10       0.44
940  doni     243   235      0         0         12       0.67
69   dami     49    49       0         0         8        0.89


A typical example of the usage of un(a) is given in (13). As the remaining meaning
dimensions again all have a value of zero, it is glossed as an undergoer marker.

(13) udeloto dosotum-un rodones
     udeloto dosotum-U  rodones.V

     ‘Udeloto rodoneses dosotum.’


In the CS lineage, noun marking is only used when the role distribution does not 
follow automatically. Thus, if a minimal pair is created in which the external argument 
qualifies better for its role than the internal argument, no marking is necessary: 


(14) dadesad  dosotum rodones 
dadesad dosotum  rodones.V 


‘Dadesad rodoneses dosotum.’


The final single-strategy lineage is TF. The results should be similar to those of CS, 
since TopicFirst is not an argument-marking strategy proper. Indeed, a case marker again 
develops, viz. etamamo/eta, which desemanticizes further than the pronoun rilelod/rid
(Table 11). Note that since preposed topics are cross-referenced by anaphoric expressions, in
this lineage verb markers are frequently used too for the development of indexing (see 
Lestrade 20152). 


Table 11: Representations of three most frequent words of the TF lineage. 


ID   form     freq  argFreq  nounFreq  verbFreq  prodEff  semWeight
22   etamamo  35    0        33        0         21       1
597  iridono  14    0        0         14        24       1
791  rilelod  126   21       38        0         19       0.89

ID   form     freq  argFreq  nounFreq  verbFreq  prodEff  semWeight
22   eta      1003  0        641       0         10       0.44
597  ira      424   0        0         420       9        0.89
791  rid      329   287      0         0         9        0.56


Combinations of the different strategies lead to predictable results. When any marking
strategy is combined with TypeCast (AFTC, TFTC, and AFTFTC), case markers emerge, 
and when TopicFirst is used, verb markers are frequently used too. The results are shown 
in Table 13, Table 14, and Table 15. 

The only lineage whose results cannot be predicted straightforwardly is AFTF. In prin- 
ciple, TopicFirst may interfere with AgentFirst, meaning that role markers are sometimes 
necessary. However, since the actor is much more likely than the undergoer to become 
the topic, in most cases AgentFirst can still be used. In the present setup, the odds for 
the actor vs. the undergoer to become the topic were kept constant at 5:1. As shown in 
Table 12, case markers apparently do not develop under these conditions. 
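
The 5:1 odds correspond to the roleTopicality weights listed in the appendix (100 for the actor, 20 for the undergoer, 1 for the event). The following sketch shows one way such weights could translate into topic choice; that they are used as sampling weights is an assumption, and the function names are hypothetical.

```python
import random

# Weights as listed in the appendix under roleTopicality.
ROLE_TOPICALITY = {"actor": 100, "undergoer": 20, "event": 1}


def topic_probabilities(weights: dict[str, float]) -> dict[str, float]:
    """Normalize the weights so that they sum to one."""
    total = sum(weights.values())
    return {role: weight / total for role, weight in weights.items()}


def choose_topic(weights: dict[str, float], rng: random.Random) -> str:
    """Draw one topic with probability proportional to its weight (assumed usage)."""
    roles = list(weights)
    return rng.choices(roles, weights=[weights[r] for r in roles], k=1)[0]


probabilities = topic_probabilities(ROLE_TOPICALITY)
print({role: round(p, 3) for role, p in probabilities.items()})
print(choose_topic(ROLE_TOPICALITY, random.Random(1)))
```

On this reading the actor is the topic in roughly 83% of utterances, which is why AgentFirst remains usable in most cases even when TopicFirst is active.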




Sander Lestrade 


Table 12: Representations of three most frequent words of the AFTF lineage. 


ID   form     argFreq  nounFreq  verbFreq  prodEff  semWeight
727  memunus  9        0         0         28       1
693  alenidu  3        0         0         18       1
641  osoranu  11       0         0         29       1

ID   form     argFreq  nounFreq  verbFreq  prodEff  semWeight
727  memun    434      0         0         17       0.44
693  alenida  0        0         234       14       0.89
641  osa      15       1         0         11       0.78


Table 13: Representations of three most frequent words of the AFTC lineage. 


ID   form     argFreq  nounFreq  verbFreq  prodEff  semWeight
356  tisosar  0        448       0         32       0.67
965  onedera  30       0         0         19       1
372  mulirol  0        3         0         24       1

ID   form     argFreq  nounFreq  verbFreq  prodEff  semWeight
356  tid      0        912       0         11       0.33
965  ona      238      0         0         9        0.67
372  mulil    60       0         0         15       0.89


Table 14: Representations of three most frequent words of the TFTC lineage. 


ID   form     argFreq  nounFreq  verbFreq  prodEff  semWeight
206  esusiti  0        428       1         32       0.56
6    inomola  5        0         0         21       1
998  irutide  12       0         0         26       1

ID   form     argFreq  nounFreq  verbFreq  prodEff  semWeight
206  esa      0        1129      0         9        0.33
6    ina      0        0         230       8        1
998  irun     227      0         0         17       0.67


Table 15: Representations of three most frequent words of the AFTFTC lineage.


ID   form     freq  argFreq  nounFreq  verbFreq  prodEff  semWeight
915  esodine  649   3        473       0         22       0.67
584  olalune  2     2        0         0         20       1
341  osutula  9     1        3         0         30       1

ID   form     freq  argFreq  nounFreq  verbFreq  prodEff  semWeight
915  esa      1447  0        903       0         9        0.33
584  ola      352   0        0         342       7        0.89
341  osur     245   237      0         0         20       0.67


6 Discussion 


In this paper it was shown how differential case marking emerges in artificial languages 
in which, initially, only lexical expressions and very general communicative principles 
are used. This section discusses the main findings and implications, plus some limitations 
that should be taken into account. 

The results suggest that formal means of argument marking, and perhaps grammar 
more generally, need not be inherent to the language system. Over time, nothing changes 
in the (cognitive/biological) makeup of the agents. Instead, the language itself adapts to 
its usage, in a process of cultural evolution (Smith & Kirby 2008; Christiansen & Chater 
2008). As a result of grammaticalization, popular marking solutions become less costly to 
produce and irrelevant meaning dimensions are removed from their lexical representa- 
tions. Thus, case markers eventually develop; i.e. maximally short forms with maximally 
general meanings that mark an argument for its relation with its head. Importantly, the 
development of these more grammatical means of expression did not improve or dimin- 
ish communicative success. In fact, events were communicated successfully throughout 
the process. 

Although it was not shown here, as the model does not yet allow for this, it is easy to 
imagine how differential case markers extend their domain of application even further 
to become obligatory for all subject or object arguments. Then, it is no longer evaluated 
whether a marker is necessary to mark a role for a specific argument, but it is used 
simply to mark that role for any argument (resulting in functional overkill, cf. Durie 
1995). The only way to get from the former to the latter, at least in the present model, 
is when speakers make the generalization that not only deviant arguments are marked, 
but any argument. Interestingly, contrary to the general assumption in the literature, this
would mean that wholesale marking is the special, derived case rather than the default,
as indeed argued for by Sinnemäki (2014). Of course, this does not mean that case marking
cannot be lost again in still later stages.

Finally, the simulations show a crucial difference between the grammaticalization
potential of markers and referential expressions. Whereas the latter often have to be fairly
specific in order to distinguish the intended referent from a large number of distractor 
objects, for markers there is always only one distractor: either the other predicate role 
or the other argument. As a result, general expressions more often suffice as markers, 
which means that they can be used much more frequently, because of which they gram- 
maticalize even further. 

Using a computer model implies that there are some obvious limitations to the present 
study too. An attempt was made to parameterize as many assumptions as possible (cf. 
the appendix). As the model is rather comprehensive, however, the full parameter space 
cannot easily be explored. For example, in the simulations it was assumed that the ac- 
tor was much more likely to be the topic, with the result that AgentFirst could still be 
used. This seems to make sense, as Comrie (1989) found that the two do indeed generally 
align. Still, it may be interesting to further explore the interplay between TopicFirst and 
AgentFirst. Some other assumptions are fundamental to the model. For example, the only
source for markers in the model is the nominal lexicon, whereas in Chinese, for instance,
the differential object marker ba derives from a verb (Yang 2008: 22).


Appendix: (Relevant) model parameter settings


#dimensionality and distinctionality of meaning representations 
distinctions=c(2,2,2,2,2,9,9,9,9) 

#initial word length 
wordLength=7 

#alphabet 
vowels=c('a','e','i','o','u'); consonants=c('d','l','m','n','r','s','t')

#lexicon size 
nNouns=999; nVerbs=499 

#preference for the external role to combine with higher values 
oddsLinkingHierarchy=2 

#amount of referential noise (0-1)
referenceNoise=.2 

#amount of noise in role assignment 
roleNoise=.5 

#maximum number of events that are ongoing in speech situation
nEvents=30

#preference for actor, undergoer and event to be the topic
roleTopicality=c(100,20,1)

#maximum number of turns conversations consist of
nTurns=30

#protoprinciples
checkSuccess=T; solutionMethod='secondArgument'
typeCast=F; castingThreshold=.7
agentFirst=F
topicFirst=F; topicCopy=T
orderAgentFirstTopicFirst='TA'

#reduction/change
reductionFrequencyThreshold=20
reductionCollostructionThreshold=5
reductionRecencyThreshold=3
suffixThreshold=15
distinctiveness=.05
erosion=T
formSetFrequency=3
erosionMax=2
desemanticization=T
desemanticizationThreshold=.01
desemanticizationCeiling=.3
minimalSpecification=2
desemanticizationMethod='variance'

#life
deathAge
procreationAge=.75


Reassessing scale effects on differential case marking: Methodological,
conceptual and theoretical issues in the quest for a universal

Karsten Schmidtke-Bode & Natalia Levshina


It is widely believed that when differential case marking depends on the referential properties
of the NP in question, it is governed by a well-defined hierarchy or scale of referential
categories, and that the resulting systematicity is one of the most robust generalizations in
linguistic typology. This view has recently been called into question, with Sinnemäki (2014)
and especially Bickel, Witzlack-Makarevich & Zakharko (2015) claiming that there is now 
firm typological evidence against such universal scale effects. Since these papers are based 
on the largest world-wide databases compiled so far, their results are likely to be taken as the 
current state of the field. In the present paper, we re-examine Bickel, Witzlack-Makarevich 
& Zakharko's (2015) data from a different perspective and re-evaluate their negative conclu- 
sions: First, we complement their analysis in terms of diachronic "family biases" by a more 
direct inspection of the raw data and an alternative statistical model, both of which afford 
a clearer understanding of where and how exactly the predicted scale effects are violated. 
Proceeding from this, we argue for the existence of universal scale effects on case mark- 
ing, and we embed this argument in a more general discussion on current methodological, 
conceptual and theoretical issues in postulating these effects. 


1 Introduction 


An important discovery of typological research is that differential argument marking
(DAM) is systematically related to what we may call the “referential properties” of the
argument in question. As outlined and exemplified in the introductory article to the
present volume, these comprise animacy, definiteness, specificity, nominality, person,
kinship and discourse-pragmatic prominence (e.g. topicality). In comparative research
since Silverstein (1976), it has been argued that contrasts in referential properties (e.g.
animate-inanimate) can be arranged into an implicational hierarchy or scale that predicts
asymmetries in argument marking.¹ Two versions of this referential scale are given in
(1),² and their classic predictions for case marking follow in (2):


(1) a. “extended animacy hierarchy” (Croft 2003: 130)

       1,2 Pro > 3 Pro > proper noun > human common noun > non-human animate
       common noun > inanimate common noun

    b. “individuation scale” (Lazard 1998: 220)

       pronoun > human definite > human indefinite/nonhuman definite >
       nonhuman indefinite > indefinite non-specific


(2) a. If a P argument is unmarked for case for a given referential category in (1a)
       or (1b), it will also be unmarked for case for all categories to the right.

    b. If an A argument is unmarked for case for a given referential category in (1a)
       or (1b), it will also be unmarked for case for all categories to the left.
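
The implicational logic of (2) can be stated as a simple check over a left-to-right sequence of categories. The sketch below is only an illustration of the two predictions; the function names and the example patterns are hypothetical and do not come from any database.

```python
# Categories of the extended animacy hierarchy in (1a), from left to right.
EXTENDED_ANIMACY = [
    "1,2 Pro", "3 Pro", "proper noun", "human common noun",
    "non-human animate common noun", "inanimate common noun",
]


def conforms_p(marked: list[bool]) -> bool:
    """(2a): once a P category is unmarked, all categories to its right are unmarked."""
    seen_unmarked = False
    for is_marked in marked:
        if not is_marked:
            seen_unmarked = True
        elif seen_unmarked:
            return False  # an overtly marked category follows an unmarked one
    return True


def conforms_a(marked: list[bool]) -> bool:
    """(2b): once an A category is unmarked, all categories to its left are unmarked."""
    return conforms_p(list(reversed(marked)))


# Hypothetical P pattern: overt case on pronouns, proper nouns and human nouns only.
p_pattern = [True, True, True, True, False, False]
assert len(p_pattern) == len(EXTENDED_ANIMACY)
print(conforms_p(p_pattern))
print(conforms_a(p_pattern))
```

A pattern with overt marking on some lower category but zero marking on a higher one (for P), or the reverse (for A), counts as a violation of the respective prediction.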


The generalizations in (2) have also been referred to as “scale effects” (Bickel,
Witzlack-Makarevich & Zakharko 2015, henceforth BWZ) or “referential effects” (e.g. van Lier
2012) on the distribution of overt case marking. With the compilation of large
cross-linguistic databases, it has recently become possible to subject these generalizations to
thorough empirical evaluation. And so far, the resulting assessments have been strikingly
negative: Thus both Sinnemäki (2014) and BWZ identify some clear areal signatures of
DAM in case marking, so that the effect might be “first and foremost a pattern prone
to diffusion” (BWZ: 40). When controlling for such areal dependencies, Bickel and his
collaborators have argued that there is no evidence for universal effects of the person
scale on indexation (Bickel, Witzlack-Makarevich, Zakharko & Iemmolo 2015;
Witzlack-Makarevich et al. 2016) and that there is, in fact, direct “evidence against universal
effects of referential scales on case alignment” (cf. the title of BWZ).

Importantly in the context of the present volume, Bickel, Witzlack-Makarevich &
Zakharko's (2015) assessment is based on the estimation of diachronic “family biases” from
synchronic data (Bickel 2011; 2013). In a nutshell, the argument is that when language
families produce new generations of offspring, they do not systematically develop in
the directions predicted by (1) and (2): Some families are internally diverse with regard
to these predictions, and among those that are significantly biased towards certain scale
effects on case marking, there is always a substantial number of families that are biased
in the opposite direction. In other words, BWZ's finding is that the predictions in (2) are
violated too often to qualify as a principle that universally guides the diachronic
development of language families.³ It is not our purpose in this paper to take issue with this
specific method. However, given that BWZ's conclusion challenges one of the most
prominent and widely cited generalizations in typology since the 1970s, we would like
to discuss and expand the empirical assessment of scale effects on case marking.

¹ The term “asymmetric” is adopted from de Hoop & Malchukov (2008) and refers to the kind of differential
argument marking in which an overt case exponent alternates with zero marking. We will return to the
notion of “markedness” (and a different way of operationalizing asymmetric case marking) in §2.1 below.

² Further incarnations of the same idea include, for example, Comrie's (1981) “animacy hierarchy”,
DeLancey's (1981) “empathy hierarchy”, Bickel's (1999) “indexability hierarchy” and Shibatani's (2006)
“relevance hierarchy”.

Specifically, we intend to do three things: Firstly, in the absence of actual diachronic
data for most of the world's language families, the most direct evidence for typological
patterns we have inevitably lies in the synchronic data themselves. Therefore, we would
first like to be clear about the synchronic picture in its full extent. To this end, we begin
(in §2) by complementing BWZ's analysis by a more direct inspection of the raw data,
which lays bare where and how exactly the predicted scale effects are violated.⁴ Secondly,
given what is at stake, we feel that BWZ's assessment should be cross-validated by other
contemporary statistical procedures for typological research, such as those proposed by
Cysouw (2010) and Jaeger et al. (2011). We show (in §3) that these mixed-effects regression
methods yield robust synchronic evidence for the predicted scale effects on case
marking. In view of this result, a more general discussion is in order about methodological,
conceptual and theoretical issues in comparative research: To what extent are purely
synchronic analyses justified? What does it take for an effect to be called “universal”, and
what is the role of the referential scale in explaining differential case marking? In
discussing these matters, we question some specific assumptions made by Sinnemäki (2014)
and BWZ, but also certain interpretations of the referential scale in formal-generative
approaches to differential case marking. In §5, finally, we conclude the paper by
summarizing our major points. Our study comes with several supplementary materials
(SM1-SM4), which can be downloaded from the authors' websites,⁵ as well as an Appendix at
the very end of the paper.


2 Dissecting the data 


2.1 Coding procedure 


BWZ examine a sample of 435 languages for referential effects on case marking, under
which they subsume all kinds of morphology on verbal arguments, regardless of its
fusion type (i.e. including adpositional flagging and non-concatenative signals of case) and
its host (i.e. including markers that are limited to elements of the NP other than the noun
itself, such as case on German determiners). The classic typological predictions with
regard to such case exponents were given in (2) above, but we need to refine the notion
of markedness at this point. The statements in (2) imply a difference between zero and
overt case marking, i.e. a contrast in coding material (as in Comrie 1981 or Croft 2003).
BWZ, by contrast, frame the predictions in terms of more abstract grammatical relations
(as in Silverstein 1976): Low-ranking P arguments (and high-ranking A arguments) are
predicted to preferably establish an unmarked grammatical relation, while high-ranking
P arguments (and low-ranking A arguments) are predicted to map onto a marked
grammatical relation. BWZ take an unmarked grammatical relation to be an alignment set
that also includes other syntactic functions beside the one at issue, notably the S role of
intransitive clauses: For example, a case formative that applies to (and hence aligns) S
and P defines an {S=P} set, while a marker that does not distinguish S, A and P defines a
yet more general {S=A=P} alignment set. On this view, case formatives that exclusively
target {P} or {A} define very narrow, thus more specific and hence structurally marked,
sets.

³ We provide some more information on the Family Bias Method in the Appendix.

⁴ We would like to thank Balthasar Bickel and his collaborators for making their entire data and their
algorithms publicly available (cf. also Bickel et al. 2017).

⁵ Cf. http://www.kschmidtkebode.de/publications.html and http://www.natalialevshina.com/publications.html.

The crucial question, then, is whether P arguments with higher referential prominence
(and A arguments with lower prominence) tend to occur in such marked alignment sets.
We can illustrate this on the basis of case marking in Chantyal (Sino-Tibetan, Bodic:
Nepal), also discussed by BWZ as a representative example of their coding procedure:
Speakers of Chantyal consistently mark A arguments by Ergative case and consistently
code S by a zero Absolutive. P arguments are split in such a way that pronouns and
human NPs always receive overt Dative case, while non-human NPs typically go in the
unmarked Absolutive, just like S. However, the marking for non-human NPs actually
depends on the degree of empathy felt towards that entity,⁶ so that the precise point at
which the referential scale is cut off is not easy to determine. At any rate, though, it is
clear that the higher-ranking P arguments define a narrow, marked alignment set {P},
while the lower-ranking P arguments are mapped onto a more general alignment set
{S=P}, and not the other way around. A arguments consistently define a narrow set {A},
i.e. they are not split to begin with.
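
The coding logic just described can be sketched as follows. This is an illustration of the idea, not BWZ's actual algorithm: roles that share a case formative are grouped into one alignment set, and a role counts as marked if its set contains no other role. The function names and the case labels ABS, ERG and DAT are hypothetical shorthand for the Chantyal facts given above.

```python
ROLES = ("S", "A", "P")


def alignment_sets(forms: dict[str, str]) -> list[set[str]]:
    """Group the roles S, A and P by identical case formative."""
    groups: dict[str, set[str]] = {}
    for role in ROLES:
        groups.setdefault(forms[role], set()).add(role)
    return list(groups.values())


def is_marked(role: str, forms: dict[str, str]) -> bool:
    """A role is marked if its alignment set contains only that role."""
    own_set = next(s for s in alignment_sets(forms) if role in s)
    return own_set == {role}


def alignment_label(forms: dict[str, str]) -> str:
    """Render the alignment sets in a notation such as 'S=P|A'."""
    return "|".join(
        "=".join(role for role in ROLES if role in group)
        for group in alignment_sets(forms)
    )


# Chantyal as described in the text: two referential conditions for P.
high = {"S": "ABS", "A": "ERG", "P": "DAT"}  # pronouns and human NPs
low = {"S": "ABS", "A": "ERG", "P": "ABS"}   # non-human NPs

for condition in (high, low):
    print(alignment_label(condition), is_marked("A", condition), is_marked("P", condition))
```

For the higher-ranking condition this yields three singleton sets with both A and P marked; for the lower-ranking condition P joins S in a shared set and is therefore unmarked, while A remains marked.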

In Table 1 below, the facts about Chantyal are represented in BWZ's coding format. 


Table 1: Coding in Bickel, Witzlack-Makarevich & Zakharko (2015) 


Language  Family        Macro-continent  Referential condition  Subsystem  A       P         Alignment
Chantyal  Sino-Tibetan  Eurasia          N-high                 NA         marked  marked    S|A|P
Chantyal  Sino-Tibetan  Eurasia          N-low                  NA         marked  unmarked  S=P|A
Chantyal  Sino-Tibetan  Eurasia          Pro                    NA         marked  marked    S|A|P


Table 1 displays the three referential conditions that are relevant to case marking in
Chantyal, summarizes the alignment pattern in each condition and specifies, for both A
and P, whether they establish a marked or an unmarked alignment set in the given
referential condition.⁷ The contrast between N-high and N-low captures the above-mentioned
fact that a more specific referential contrast (such as animate-inanimate) is difficult to
establish.

⁶ In reference to animals, for example, one can contrast ‘I killed the chicken-Ø’ with ‘I cut the chicken-DAT
[so that it bled]’, cf. Noonan (2003).

Having clarified the basic coding procedure in BWZ, we can now examine the data 
with regard to the case splits they contain. To this end, the following subsections will 
take a closer look at the effects of the most important referential dimensions coded in the 
data. In other words, we here first inspect the effects of individual referential properties 
that are included in hierarchies like (1), such as animacy or person, before we examine 
the combined effect of these dimensions in §3. Our major goal for the moment is thus to
provide typologists with an idea of how numerous the exceptions to well-known refer- 
ential subscales are and where these are located, i.e. which languages and stocks show 
which kinds of counterexamples. Although some of the relevant scales are also tested 
by BWZ, they do not provide the kind of “raw” information we present here, so the 
following data can be seen as complementary to the statistical analysis offered by BWZ. 


2.2 The global picture 


The overall distribution of differential case marking is nicely laid out in BWZ (pp. 24-31), 
especially from an areal perspective. We will discuss the areal patterns in §4 and hence
confine ourselves to the overview of the data given in Table 2.⁸


2.3 High-low distinctions: Animacy, definiteness, topicality and the 
like 


Perhaps the best-known kinds of case-marking splits are controlled by animacy (as 
in Armenian (Indo-European) or Gurung (Sino-Tibetan)), definiteness (as in Amharic 
(Semitic), Brahui (Dravidian) or Barasano (Tucánoan)), specificity (as in Persian (Indo- 
European) or Udihe (Tungusic)), kinship (e.g. Gumbaynggir (Pama-Nyungan) or unique- 
ness (proper versus common nouns (e.g. Gitksan (Tsimshianic)). lemmolo (2010), among 
others, additionally points to the importance of topicality in inducing case splits. Over- 
all, such contrasts are relevant to 83 cases (= 60%) of all P-splits and 7 cases (= 12%) of 


"The column “Subsystem” does not apply to Chantyal and is hence coded as “not applicable (NA). In other 
languages, it captures situations in which the case-marking system is sensitive to other structural factors, 
such as the difference between main and dependent clauses, periphrastic and synthetic verb forms, etc. 
Each of these conditions is then evaluated separately with regard to whether case marking also interacts 
with referential properties of the NP and which alignment sets result. The overall number of case-marking 
(sub)systems (N = 462) is thus somewhat higher than the number of languages in BWZ's sample (N = 435). 
Additionally, it should also be noted that BWZ concentrate on what they call “default verb classes” in their 
paper, disregarding, for instance, the case marking and alignment of experiencer NPs; in other words, their 
focus is on canonical transitive and intransitive clauses. 

⁸The counts presented in Table 2 differ very slightly from BWZ's original ones: First, we break up BWZ's 
"Other" area into Africa and the Americas, in order not to lose this kind of information coded in the data; 
this holds for all analyses to follow in this paper. Second, BWZ's Table 5 on P-marking fails to list Máku, an 
isolate of South America. Conversely, our own analysis discards Hindi, for which the original coding was 
complicated by multiple subsystems with overlapping referential categories that did not allow a straight- 
forward reanalysis. 
Table 2: Overview of P- and A-splits in the data 


Macro- Family Split Macro- Family Split 
continent systems continent systems 
P A P A 
Africa Adamawa-Ubangi 1 Americas Arawakan 1 
Benue-Congo 2 Barbacoan 2 
Chadic 2 Haida 1 
Cushitic 2 Macro-Ge 1 
Indo-European 1 Máku 1 
Kwa 1 Nadahup 1 
Omotic 2 Pano-Tacanan 1 1 
Semitic 1 Pomoan 1 
South Atlantic 1 Siouan 1 
Eurasia Austroasiatic 1 ee 8 
Se Tsimshianic 1 
Dravidian 7 , 
Tucanoan 4 
Indo-European 31 15 
Uto-Aztecan 
Kusunda 1 à 
i Zuni 1 
Mongolian 4 
Nakh-Daghestanian 1 3 Sahul Austronesian 1 
Semitic 1 Awyu-Dumut 1 
Sino-Tibetan 13 8 Kalam 1 
Tungusic 1 Madang 1 
Turkic 7 Mangarayan 1 1 
Uralic 3 Mirndi 1 
Oksapmin 1 
Pama-Nyungan 26 29 
Tangkic 1 
Timor-Alor-Pantar 3 


all A-splits. In BWZ's study, the dimensions of animacy, definiteness, specificity, kin- 
ship and uniqueness are recorded as such in the database, while discourse-pragmatic 
and other language-specific contrasts (cf. Chantyal above) are coded as a more general 
Nhigh-Nlow contrast. For purposes of statistical testing, all of these dimensions can be 
conflated into a ProNhigh > ProNlow scale.⁹ In Table 3 below, we have compiled the data 
that are relevant to this scale and outline to what extent they are in keeping with the pre- 
dictions for P- and A-marking, respectively. In this and all following tables of the same 
sort, “fit” indicates that a given system fits the predictions of the scale in question and 
“vio” indicates that it goes against it. “NA” captures all languages that do not exhibit the 
relevant split. The figures refer to the number of languages, while the figures in brackets 
indicate the number of distinct families from which these languages come. Violations 
are additionally underlined. 


⁹The inclusion of pronouns on the scale is justified by the fact that the split between high and low referential 
prominence may also (or even exclusively) affect pronouns and not only nouns (e.g. in Central Pomo 
(Americas), where this applies to the third person pronouns). 




Table 3: Systems with 'high-low' splits in case marking 


P-marking A-marking 
Eurasia Africa Americas Sahul Eurasia Africa Americas  Sahul 
fit 55(9) 3(3) 11 (7) 13 (4)  2(2) 0 (0) 0 (0) 4 (2) 
vio 0(0) 1(1) 0 (0) 0 (0) 0 (0) 0 (0) 1 (1) 0 (0) 
NA 15(4 or 8 (6) 23(6  24(3  0(0) 1(1) 27 (2) 


For P-marking, the splits virtually always work in the predicted direction, i.e. low- 
ranking nouns are structurally unmarked while high-ranking ones are marked. The only 
exception in the entire database is Sheko (Omotic), in which the distribution is reversed. 
In this language, we find an unspecified high-low contrast in the database; therefore, 
wherever the more concrete dimensions of animacy, definiteness and specificity are involved, 
there is no single counterexample to the predicted effects. For A-marking, the 
high-low distinction is much less relevant than for P-splits, so that the numbers are very 
small to begin with. Again, however, there is only a single exceptional language in the 
data: This is Gitksan (Tsimshianic: Americas), where common nouns are unmarked while 
proper nouns are marked, which is precisely the opposite of the predicted effect (un- 
der which specific marking, for example, should preferentially apply to lower-ranking 
A arguments). The effect from these referential dimensions is thus very robust cross- 
linguistically. 


2.4 Nominality: Splits between pronouns and lexical NPs 


A fundamental distinction on the hierarchies in (1), but also all of its further variants in 
the literature, is that between pronominal and lexical (i.e. full nominal) NPs. On all four 
macro-continents distinguished in Table 2, there are languages which reserve specific 
P-marking for pronouns and allocate their nouns to an unmarked alignment set (e.g. 
Yoruba, Gulf Arabic, Thayorre and many others). The opposite distribution would be 
expected for A-marking (e.g. Cashinahua or Yukulta). Overall, nominality governs 33 
cases (= 24%) of differential P-marking and 17 cases (= 29%) of differential A-marking. 
Apart from such “clean splits” between the two categories, one may, however, also adopt 
a broader view of the markedness distributions of pronouns and nouns: If, for example, 
a language exhibits a split of its pronouns but not its nouns, the question is whether the 
nouns join the marked or the unmarked alignment set (for P, the prediction would be 
“unmarked” while it would be “marked” for A). We can thus distinguish four scenarios 
in the data, and we provide the relevant figures for each of them in turn: 

Scenario A: A given case system makes a “clean” Pro-N distinction. As can be seen in 
Table 4, wherever this happens, there is not a single language going against the predicted 
direction of the split, neither for P- nor for A-marking. 
Table 4: Systems with “clean” Pro-N splits in case marking 


        P-marking                              A-marking
        Eurasia  Africa  Americas  Sahul       Eurasia  Africa  Americas  Sahul
fit     7 (3)    4 (3)   5 (4)     17 (5)      1 (1)    0 (0)   1 (1)     15 (2)
vio     0 (0)    0 (0)   0 (0)     0 (0)       0 (0)    0 (0)   0 (0)     0 (0)
NA      63 (10)  9 (6)   14 (10)   19 (6)      25 (3)   0 (0)   1 (1)     16 (2)


Scenario B: A given case system partitions nouns into marked and unmarked subsets 
but does not divide up pronouns. There is not a single example of P-marking in which 
the pronouns join the unmarked set (Table 5). 


Table 5: Systems with splits in nouns but not in pronouns 


P-marking A-marking 
Eurasia Africa Americas Sahul Eurasia Africa Americas Sahul 
fit 46(9)  4(3) 8 (5) 10(4)  1(0) 0 (0) 0 (0) 2 (2) 
vio 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1(1) 0 (0) 
NA 24(4 or 11()  26(6) 25(3) 0(0) 10) 29 (2) 


As can be seen, there is one exceptional system for A-marking: This is Gitksan (Tsimshi- 
anic: Americas), in which common nouns are in an unmarked alignment set while proper 
nouns and pronouns are marked, i.e. we find exactly the opposite distribution from what 
is predicted for A-marking. 

Scenario C: Where systems partition pronouns into marked and unmarked subsets 
but do not divide up nouns, the data look as follows (Table 6). 


Table 6: Systems with splits in pronouns but not in nouns 


P-marking A-marking 
Eurasia Africa Americas Sahul Eurasia Africa Americas Sahul 
fit 7 (2) 4 (4) 1 (1) 7(3)  18(3 om 0 (0) 12 (1) 
vio 0 (0) 1(1) 1(1) 0 (0) 5 (1) 0 (0) 0 (0) 0 (0) 
NA 63(10 Sie 1701  29(8) 32 0 (0) 2 (2) 19 (3) 


For P-marking, the prediction is that nouns will join those pronouns that are found in 
an unmarked set, while the opposite is predicted for A-marking. Two languages violate 
this prediction for P-marking, namely Oromo (Cushitic) and Osage (Siouan). In Oromo, 
the unmarked set comprises all pronouns in the plural while singular pronouns and all 
nouns receive P-marking; in Osage, all nouns and third-person pronouns are marked 
while SAPs are unmarked. For A-marking, we find five aberrant systems, all from Indo- 
European and specifically Iranian (Roshani and participial clauses in Khufi, Yazgulyámi, 
Tarom and Bartangi);¹⁰ in all of them, nouns join an unmarked grammatical relationship. 

Scenario D: Where languages partition both nouns and pronouns into marked and 
unmarked alignment sets, this inevitably results in discontinuities between Pro and N 
on the referential hierarchy and hence in a violation of the Pro>N subscale. The relevant 
languages are shown in Table 7. 


Table 7: Languages with splits in both pronouns and nouns 


            P-marking                                A-marking

Eurasia     Albanian, German, Vafsi and              Qiang
            non-participial clauses in 6
            Iranian languages (Tarom,
            Shahrudi, Dimli, Kirmanjki,
            Kajali, Eshtehardi)

Africa      ---                                      ---

Americas    Tsafiki, Tarascan, Máku,                 ---
            Central Pomo

Sahul       Kala Lagaw Ya, Gumbaynggir               Kala Lagaw Ya, Yandruwandha
            (both Pama-Nyungan)                      (both Pama-Nyungan)


The languages in Table 7 differ in how exactly they implement a Pro-N split, par- 
ticularly with regard to the distribution of individual referential categories within the 
pronouns (e.g. singular versus plural pronouns (Albanian), 2PL versus all others (Vafsi), 
1+2PL versus the rest (Eshtehardi/Dimli/Kirmanjki main clauses), etc.). Upon closer in- 
spection, however, it turns out that these rather idiosyncratic splits are largely confined 
to the Iranian languages in Table 7; moreover, there are some principled regularities 
again: Firstly, in all of the above languages, the nouns are split in such a way that they 
conform to the predictions of the Nhigh>Nlow scale, and this applies to both P- and A- 
marking. (The only exception is German, where the split is according to different noun 
classes and not referential properties as such.) And secondly, pronouns and nouns may 
both be split according to the same principle, namely an animacy or definiteness contrast 
(e.g. Tsafiki, Tarascan, Máku and Central Pomo P-marking and Qiang A-marking); as a re- 
sult, high-ranking (animate, definite) nouns and pronouns are split off from low-ranking 
(inanimate, indefinite) nouns and pronouns, thus creating a discontinuity between Pro 
and N on the referential scale. The observed diversity, therefore, primarily resides in 
the way that specific person-number categories are organized, and we will turn to these 
presently. 


¹⁰These Iranian languages are very closely related; in fact, Roshani, Khufi and Bartangi are sometimes considered 
dialects of the Shughni language. Similar remarks apply to the Iranian languages which follow in 
Table 7. 
2.5 Person-conditioned splits 


Differential case marking according to person-number constellations is attested for 29 
systems (= 21%) for P-marking and 32 systems (= 54%) for A-marking. In the following, 
we examine person splits separately for singular and non-singular (dual, plural) number, 
in order to capture the empirical picture as precisely as possible. Table 8 shows which 
person splits are attested in the singular. 


Table 8: Person splits in the singular (^ indicates the number of violating systems) 
P-marking A-marking 
Eurasia Africa Americas Sahul Eurasia Africa Americas Sahul 
123 3(1) 10) 0 (0) 10)  4( 0 (0) 0 (0) 2) 
12-3 2(Q2) 1(1) 3 (3) 6(2% 7(3) 0 (0) 0 (0) 4 OT 
233 4()'7*  o(0) 0 (0) 0(0  3()"* 0(0) 0 (0) 2 OT 
NA  61(10) 11(9) 16(10) 29(9) 12(3) 0 (0) 2 (2) 23 (3) 


When languages show a 1-23 split, the predicted direction is 1>23, e.g. a marked alignment 
set for first-person P. The three Eurasian languages that feature this split for P 
(all Indo-European) uniformly behave in the predicted direction; in Tera (Chadic: Africa) 
and Teiwa (Timor-Alor-Pantar: Sahul), by contrast, this scale is violated (23-1). For A, 
three Eurasian languages (all Sino-Tibetan) fit the predicted direction while an Indo- 
European system (Tarom participial clauses) goes against it; the two Sahul languages 
are both Pama-Nyungan and show a violation and a fit, respectively. 

At least one taxon from each area exhibits a 12-3 split in P-marking, with one violation 
of the predicted direction each in the Americas (Osage) and in Sahul (Teiwa). For A-marking, 
the only violation of the scale comes from the Pama-Nyungan language Alyawarra. For 
the singular, then, the 12>3 scale looks more promising than the 1>23 scale. 

What is more difficult to evaluate in terms of scalar predictions is languages that make 
a 2-13 split, as this split is not predicted by the common versions of the referential hi- 
erarchy. BWZ set out to test a hierarchy including 1>2>3 and one including 12>3. If we 
assume that both of these scales are violated by a 2-13 split, all of the languages in the 
third row of Table 8 above are problematic and hence constitute counterevidence to the 
implicational hierarchy in (1a); note that they all come from either Indo-European or 
Pama-Nyungan. 

In the non-singular (conflating plural and dual patterns here), the distribution of per- 
son splits is as follows (Table 9). 

As can be seen, systems with a 1-23 split, despite not being numerous, are consistently 
organized in the predicted direction, i.e. there is no violation of this scale this time (in 
contrast to what we saw for the singular above). For 12-3 splits, A-marking is also well- 
behaved without exceptions, while six Indo-European systems (all from closely related 
Iranian languages), and again Osage (Americas) and Teiwa (Sahul), violate the 12>3 scale 
Table 9: Person splits in the non-singular (^ indicates the number of violating systems) 
P-marking A-marking 
Eurasia Africa Americas Sahul Eurasia Africa Americas Sahul 
1-23 0 (0) 0 (0) 1(1) 0(0 30 0 (0) 0 (0) 3 (1) 
123 8(QHéú 10m 2 (2) SOU  13(3) 0 (0) 0 (0) 6 (1) 
2-13 3(0D"* 10) 0 (0) 10  2()0" 0 (0) 0 (0) 0 (0) 
NA  59(10) 11(9)  16(10) 30(8) 8(3) 0 (0) 2(2) 22 (3) 


for P-marking. In the latter two languages, then, the violation of the 12>3 scale applies 
to both singular and non-singular pronouns, whereas in Indo-European, the violations 
are confined to the non-singular. Finally, we also find some 2-13 splits again; apart from 
Indo-European (Vafsi, Chali (A- and P-marking), English (P-marking only)), these are 
now also found in Tsamai (Cushitic: Africa) and Tamambo (Austronesian: Sahul). 

The figures provided in this section are not directly comparable to BWZ's, as we ex- 
amine person effects for the two number categories separately while BWZ intended to 
home in on one referential dimension at a time (i.e. they tested the robustness of person 
scales regardless of the number distinction and vice versa). At any rate, however, it is 
clear that there is quite a bit of diversity with regard to the pronominal splits in question 
and in view of the small overall numbers and the amount and distribution of exceptions, 
no straightforward universal appears to emerge from eyeballing the data. The Family 
Bias estimations involving such person splits (cf. Table 16 and Table 17 in the Appendix) 
yield roughly as many biases in favour of each ranking as against it, and we will have to 
await our alternative statistical evaluation in §3 to see if the distributions are still robust 
enough to support the most widespread version of the referential scale, which comprises 
a 12>3 contrast (as in (1a)). 


2.6 Number-conditioned splits 


The final split in the data is one of number: According to Bickel's (1999) “indexability 
hierarchy”, “singular and individualized referents are generally easier to point at unambiguously 
than groups or masses”, suggesting that “in many languages, they figure 
higher on the indexability hierarchy” (Bickel & Nichols 2002: 225). Following this logic, 
Tables 10-12 below display how the data fit a potential SG>NSG scale. Again, we 
do this separately for each person category and, in the third person, also separately for 
nouns and pronouns. 

Again, BWZ seek to assess the number scale as such, without the possible effects of 
cross-cutting person distinctions. In doing so, they roughly find at least as many violations 
of the SG>NSG scale as supporting taxa in all areas. The raw but more fine-grained 
data shown here are complex and suggest a different picture for P- and A-marking. For 
P-splits, the scale in question mostly (i.e. except for Sahul) receives more support than vi- 
Table 10: Systems with SG>NSG splits in the first person 


P-marking A-marking 
Eurasia Africa Americas Sahul Eurasia Africa Americas Sahul 
fit 12 (1) 4 (3) 0 (0) 2 (2) 2 (2) 0 (0) 0 (0) 1(1) 
vio  0(0) 1(1) 0 (0) 100 u(t) om 0 (0) 10 (1) 
NA 58(1 8 (6) 19(13)  33(8  13(3) 0(0) 2 (2) 20 (3) 


Table 11: Systems with SG>NSG splits in the second person 


P-marking A-marking 
Eurasia Africa Americas Sahul Eurasia Africa Americas Sahul 
fit 8 (1) 3 (3) 1 (1) 1 (1) 2 (2) 0 (0) 0 (0) 1(1) 
vio 0 (0) 0 (0) 0 (0) 0 (0) 9 (1) 0 (0) 0 (0) 6 (1) 
NA 62(1) 107)  13(12)  34(8  15()  0(0) 2 (2) 24 (3) 


olations (in raw counts), and it is even exceptionless in the second person. For A-marking, 
by contrast, there are consistently more violations than fits, yielding BWZ's family-bias 
results in Figure 17 (Appendix). In other words, there is clear evidence against the SG>NSG 
scale for A-marking while the picture is less straightforward for P-marking. We leave 
the latter to be explored further by our own statistical model, which will be presented 
in the next section. 


3 Remodelling the data 


Now that we have a clearer idea of individual referential dimensions and their behaviour, 
we can test the robustness of a scale on which they are combined. Perhaps the best- 


Table 12: Systems with SG>NSG splits in the third person 


P-marking A-marking 


Eurasia Africa Americas Sahul Eurasia Africa Americas Sahul 
fit. PRO 4 (1) 4 (4) 0 (0) 1 (1) 1(1) 0 (0) 0 (0) 0 (0) 
fit.N 1(1) 0 (0) 1(1) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 
vio PRO 1(1) 0 (0) 0 (0) 1(1) 4 (1) 0 (0) 0 (0) 2 (1) 
vio.N 0 (0) 0 (0) 0 (0) 2 (1) 0 (0) 0 (0) 0 (0) 0 (0) 
NA 6401  9(7) 18 (13)  32(9) 21(3) 0 (0) 2 (2) 29 (3) 
known version of an extended referential hierarchy is the one that recognizes a distinction 
between speech-act participants and third persons (12>3), a difference between 
pronouns and full nouns (Pro>N) and a high-low distinction among nouns (which may 
consist in animacy, definiteness, specificity, topicality and other contrasts). The resulting 
scale, which is also tested in BWZ, is given in (3) below: 


(3) 1,2 Pro > 3 Pro > Nhigh > Nlow 


The relevant predictions for case marking are the previous ones in (2), bearing in mind 
that “markedness” is defined in terms of alignment sets. For reasons of space and the 
small number of data points, we will have to confine ourselves to DOM here and exclude 
differential A-marking from testing. Following BWZ, we will perform two different kinds 
of statistical evaluation, viz. a conceptually simpler type model in §3.1 and a somewhat 
more complex rank model in §3.2. 


3.1 Type-based modelling 


The basic question in this kind of model is whether the systems that fit the scale in (3) 
significantly outnumber the systems that violate it, while controlling for genealogical 
and areal dependencies. The critical issue, therefore, is whether each of the 137 split-P 
systems in the data is considered a fit to or a violation of (3). In order to be maximally 
cautious, any kind of violation on the following subscales of (3) resulted in the system 
being coded as “violating”: 


• Nominality: If a language has a “clean” Pro-N split, it fits (3) if the pronouns are 
marked while the nouns are unmarked; the opposite pattern is a violation. If a 
language splits only its pronouns, it fits (3) if the nouns join the unmarked set of 
pronouns; the opposite pattern is a violation. If a language splits only its nouns, 
it fits (3) if the pronouns join the marked set of nouns; the opposite pattern is 
a violation. If a language splits both its nouns and its pronouns, it counts as a 
violation (cf. our comments in §2.4 above). 


• Nhigh-Nlow: All splits according to animacy, definiteness, topicality, kinship and 
uniqueness are subsumed under the Nhigh>Nlow distinction (just as in BWZ's test). 
Since these are usually binary contrasts, they fit the scale in (3) if higher nominals 
are P-marked while the lower ones are not, while the opposite situation is a violation 
of (3). 


• Person: 


- If a language shows a 12-3 split in its pronouns, it fits (3) if speech-act participants 
(1,2) are marked while 3 is unmarked; the opposite pattern is a violation. 


- If a language shows a 1-23 split, it can be considered a “partial fit” if it takes 
the direction of 1>23 (i.e. with first person being marked and the others unmarked); 
in that case, it arguably does obey the proposed 1>3 ranking, while 
it does not make a distinction between 2 and 3. If the direction of the split is 
23>1, it counts as a violation of (3). 


- If a language shows a 2-13 split, it violates the 12>3 part of (3), no matter 
which direction the split takes (cf. our earlier discussion of this issue). 


- Where a language exhibits different kinds of person splits for singular and 
non-singular number, each of them was first evaluated separately according 
to the above criteria, and the values were subsequently combined into a single 
one. If a system showed a fit in one number category and a partial fit in 
the other, we coded it as fit; if a system showed a fit in one and a violation in the 
other (e.g. Tera and Tsamai), we coded it as partial fit; if a system showed 
a partial fit in one number category and no split in the other (e.g. Shughni), 
we also counted it as a partial fit. All other combinations containing some 
violation were counted as violating systems. 
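
The person-related part of this coding policy can be sketched in a few lines of Python. This is only an illustration of the rules as stated above, not the coding script actually used for the study: the function names and labels are hypothetical, and two cases the text does not spell out (a fit combined with no split, and two partial fits) are filled in by assumption and flagged in the comments.

```python
from typing import Optional

FIT, PARTIAL, VIOLATION = "fit", "partial", "vio"


def code_person_split(split: Optional[str], marked: Optional[str]) -> Optional[str]:
    """Code a person split in ONE number category against the 12>3 subscale.

    split  -- "12-3", "1-23", "2-13", or None if there is no person split
    marked -- the person group that receives P-marking, e.g. "12", "3", "1", "23"
    """
    if split is None:
        return None
    if split == "12-3":
        return FIT if marked == "12" else VIOLATION
    if split == "1-23":
        # 1>23 obeys 1>3 but ignores 2 vs. 3: only a partial fit
        return PARTIAL if marked == "1" else VIOLATION
    if split == "2-13":
        return VIOLATION  # violates 12>3 whichever side is marked
    raise ValueError(f"unknown split type: {split}")


def combine_numbers(sg: Optional[str], nsg: Optional[str]) -> Optional[str]:
    """Combine the singular and non-singular codings into one value."""
    values = {sg, nsg}
    if values == {None}:
        return None                      # no person split at all
    if values == {FIT, VIOLATION}:
        return PARTIAL                   # fit in one, violation in the other
    if VIOLATION in values:
        return VIOLATION                 # all other combinations with a violation
    if FIT in values:
        return FIT                       # fit+partial; fit+fit and fit+none are assumptions
    return PARTIAL                       # partial+none; partial+partial is an assumption


# A 12-3 split (12 marked) in the singular, a 1-23 split (1 marked) in the non-singular
print(combine_numbers(code_person_split("12-3", "12"), code_person_split("1-23", "1")))
```

Keeping the per-number coding separate from the combination step mirrors the two-stage procedure described in the last bullet point.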


As a result of this coding policy, we obtained the following raw data for the scale in 
(3) (Table 13). 

These figures suggest a rather strong tendency for both systems and families to fit the 
scale in all macro continents, but in order to control for genealogical relationships and 
areal dependencies in a rigorous way, a mixed-effects generalized linear model (GLM) is 
called for. We thus applied a mixed Poisson GLM (also known as mixed loglinear model) 
to the data at hand. To this end, the data were first cross-tabulated into the format shown 
in Table 14 (the full dataset is available as supplementary material SM1).¹¹ 

The results of loglinear modelling show that there is no interaction between the fixed 
effects of Fit and MContinent (p = 0.637): In all areas, there is a strong preference for 
fitting systems even when genealogical relations are controlled for: b = 1.43, p < 0.0001 (cf. 
SM3 for further details). The estimates in a Poisson model represent the multiplicative 
effect of a variable on the outcome on the log scale, which means that “fit” is about 
e^1.43 ≈ 4.2 times more probable than “violation”.¹² 
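
The back-transformation from the log scale can be checked directly. The snippet below is a minimal sketch that simply exponentiates the two coefficients reported in this section (b = 1.43 for the Poisson model and b = 1.44 for the intercept of the quasibinomial alternative); the variable names are our own.

```python
import math

b_poisson = 1.43   # estimate for Fit in the mixed Poisson model
b_binomial = 1.44  # intercept of the quasibinomial alternative

ratio = math.exp(b_poisson)   # multiplicative effect of "fit" vs. "violation"
odds = math.exp(b_binomial)   # odds of fitting vs. violating

print(f"rate ratio: {ratio:.1f}, odds: {odds:.1f}")
```

Both values round to 4.2, which is why the two modelling approaches can be said to converge.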

In short, the type model suggests that there is a strong cross-linguistic tendency for 
languages to fit the referential scale in (3), independently of macro-continental affilia- 
tions. Since the counts were aggregated across language families, the observed cross- 
linguistic bias towards fitting the scale cannot be attributed to the possible impact of 
larger families, either. 


¹¹For reasons of simplicity, we discarded the two “partial” languages in Table 13 (viz. Tera and Tsamai, both 
Afro-Asiatic). 

¹²An alternative to the above loglinear format is to treat the number of fitting and violating systems as 
successes and failures in trials within a family, similar to heads or tails when one tosses a coin (where 
each new language produces either heads or tails). It would then be appropriate to apply logistic binomial 
regression. We tested whether MContinent had a significant influence on the chances of fits as compared 
to violations within each family. Because of some amount of overdispersion, a quasibinomial GLM was 
used. This procedure yielded the same result as the one presented above. There is no significant effect 
of MContinent on the chances of fitting or violation. A model with the intercept only has a significant 
intercept b = 1.44, p < 0.0001, which means that the odds of fitting are e^1.44 ≈ 4.2 times higher than those 
of violation. This result is almost identical to the one presented above. The two modelling approaches thus 
converge, which is reassuring. 
Table 13: Systems fitting or violating the scale in (3) 


Eurasia Africa Americas Sahul 
fit 56 (11) 9 (8) 14 (9) 31 (8) 
vio 14 (1) 2 (2) 5 (5) 5 (3) 
partial 0 (0) 2 (2) 0 (0) 0 (0) 


Table 14: Data coding for the Poisson GLM (segment) 


Family MContinent Fit Freq 
Adamawa-Ubangi Africa fit 1 
Adamawa-Ubangi Africa vio 0 
Benue-Congo Africa fit 2 
Benue-Congo Africa vio 0 
Chadic Africa fit 1 
Chadic Africa vio 0 
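
As an illustration of how system-level codings can be cross-tabulated into this format, consider the following sketch. The input records are toy data chosen to reproduce the segment of Table 14 shown above (the individual systems are not named here), and the function name `cross_tabulate` is hypothetical. Note that zero counts must be listed explicitly, since the Poisson model needs a row for every Family × Fit combination.

```python
from collections import Counter

# One (Family, MContinent, Fit) record per system; toy data for illustration
systems = [
    ("Adamawa-Ubangi", "Africa", "fit"),
    ("Benue-Congo", "Africa", "fit"),
    ("Benue-Congo", "Africa", "fit"),
    ("Chadic", "Africa", "fit"),
]


def cross_tabulate(records):
    """Return (Family, MContinent, Fit, Freq) rows, including zero counts."""
    counts = Counter(records)
    families = sorted({(family, area) for family, area, _ in records})
    return [
        (family, area, fit, counts[(family, area, fit)])
        for family, area in families
        for fit in ("fit", "vio")
    ]


rows = cross_tabulate(systems)
for row in rows:
    print(*row)
```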


3.2 Rank-based modelling 


In this kind of model, it is tested whether higher-ranking P arguments stand a better 
chance of being structurally marked than lower-ranking ones. More precisely, we are 
probing an ordinal relationship by which the odds for marked P arguments should decrease 
as we proceed down the ranks on the scale (i.e. 1st rank > 2nd rank > 3rd rank, 
etc.). In order to run an appropriate model, the data were converted into the following 
long format (Table 15). 


Table 15: Data coding for the rank-based GLM (segment) 


MContinent  Family          System  RefCat  Number  Marking   Rank
Africa      Adamawa-Ubangi  Gbeya   1       SG      marked    12
Africa      Adamawa-Ubangi  Gbeya   1       NSG     marked    12
Africa      Adamawa-Ubangi  Gbeya   2       SG      marked    12
Africa      Adamawa-Ubangi  Gbeya   2       NSG     marked    12
Africa      Adamawa-Ubangi  Gbeya   3       SG      marked    3
Africa      Adamawa-Ubangi  Gbeya   3       NSG     marked    3
Africa      Adamawa-Ubangi  Gbeya   Nhigh   SG      unmarked  Nhigh
Africa      Adamawa-Ubangi  Gbeya   Nhigh   NSG     unmarked  Nhigh
Africa      Adamawa-Ubangi  Gbeya   Nlow    SG      unmarked  Nlow
Africa      Adamawa-Ubangi  Gbeya   Nlow    NSG     unmarked  Nlow
This format represents each system in the data by 10 rows, allowing us to code each 
combination of referential category (cf. 4th column, RefCat) and number (5th column) 
separately. This way, we can now also take person differences between singular and 
non-singular into account. The full data are available as supplementary material SM2. 
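
The expansion of one system into its 10 rows can be sketched as follows. The function `to_long_format` and the `RANK` mapping are hypothetical names; the mapping assumes that first and second person share the top rank (12) and that the two noun categories occupy the third and fourth ranks of the scale in (3). The Gbeya values follow the segment shown in Table 15.

```python
from itertools import product

REF_CATS = ["1", "2", "3", "Nhigh", "Nlow"]
NUMBERS = ["SG", "NSG"]
# Assumed mapping from referential category to rank on scale (3)
RANK = {"1": "12", "2": "12", "3": "3", "Nhigh": "Nhigh", "Nlow": "Nlow"}


def to_long_format(area, family, system, marked):
    """Return one row per RefCat x Number combination (5 x 2 = 10 rows).

    marked -- set of (RefCat, Number) pairs that receive P-marking
    """
    return [
        (area, family, system, cat, num,
         "marked" if (cat, num) in marked else "unmarked", RANK[cat])
        for cat, num in product(REF_CATS, NUMBERS)
    ]


# Gbeya: all pronouns are marked, all nouns are unmarked
gbeya = to_long_format(
    "Africa", "Adamawa-Ubangi", "Gbeya",
    marked={(cat, num) for cat in ("1", "2", "3") for num in NUMBERS},
)
for row in gbeya:
    print(*row)
```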

We fitted a mixed-effects logistic GLM to these data. The response variable was Mark- 
ing, with the values “marked” and “unmarked” (sixth column of Table 15). The predictor 
that represented the position of the arguments on the referential scale was called Rank 
(last column). We included Number and MContinent as further fixed effects and tested 
the interactions between the predictors. The individual tendencies of systems and lan- 
guage families to mark more or fewer referential categories (variables System and Fam- 
ily) were encoded as random intercepts. Since System is nested within Family, we are 
dealing with a multilevel hierarchical model. 
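
One way of writing down the kind of model described in this paragraph is given below. It is a sketch under our own notational assumptions, not a specification taken from the supplementary materials: i indexes rows of the long-format data, f(i) and s(i) denote the family and system of row i, Rank is treated as a single numeric term for readability, the coefficients on MContinent stand for one parameter per non-reference area, and the two interaction terms are those reported as significant in the text.

```latex
\begin{align*}
\operatorname{logit} P(\mathit{Marking}_i = \text{marked})
  &= \beta_0 + \beta_1\,\mathit{Rank}_i + \beta_2\,\mathit{Number}_i
     + \boldsymbol{\beta}_3^{\top}\mathit{MContinent}_i \\
  &\quad + \beta_4\,(\mathit{Rank}_i \times \mathit{Number}_i)
     + \boldsymbol{\beta}_5^{\top}(\mathit{Rank}_i \times \mathit{MContinent}_i) \\
  &\quad + u_{f(i)} + v_{s(i)}, \\
u_f &\sim \mathcal{N}(0, \sigma^2_{\mathit{Family}}), \qquad
v_s \sim \mathcal{N}(0, \sigma^2_{\mathit{System}})
\end{align*}
```

The two random intercepts capture the nesting of systems within families: each system shares its family's deviation and adds a deviation of its own.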

The analyses reveal a significant main effect of Rank as well as two significant inter- 
actions between the predictors: one between Rank and Number, and the other between 
Rank and MContinent. In the presence of multiple interactions, it is best to explore the 
results visually. Figure 1 displays the average probabilities of “marked” P arguments in 
the singular and the non-singular on the vertical axis. The horizontal axis represents the 
four ranks on the scale, from left to right. The different colours and lines correspond to 
the four macro continents, which are explained in the legend. 

In the singular, we observe very little if any difference between the first two positions 
on the scale (12 and 3). Figure 1 thus confirms our earlier observation that the differ- 
ence between speech-act and third-person (singular) pronouns is not very relevant for 
P-marking overall, but also that there are hardly any violations of the predicted effect 
where it occurs. In Africa and Sahul, the most obvious decrease in the chances of P 
being marked is found between the pronouns and the nouns. In contrast, the Ameri- 
cas and Eurasia have a large difference in the probability of marking between all high- 
prominence arguments (pronouns and high-prominence nouns) and low-prominence 
nouns. 

In the non-singular, Figure 1 nicely reflects what we saw in Table 9 above: In the 
Americas (specifically, Osage) and particularly in Eurasia, there is a certain number of 
languages that violate the 12>3 part of the scale, leading to a slight positive rather than 
the expected negative slope of the relevant curves in Figure 1. We saw above that these 
exceptions are virtually all located in Iranian languages, and their effect is not strong 
enough to yield significant counterevidence (post-hoc tests of P-marking: Eurasia: b = 
0.22, p = 0.284, Americas: b = 0.13, p = 0.767). By contrast, all other ranking effects in 
Figure 1 are negative and significant (cf. SM3 for further technical details of the model). 

In sum, what we can take from this model is the following: 


• We do not find any significant violations of the referential hierarchy in (3). 


¹³We also tested models in which we additionally allowed for the rank effect to vary between the families 
in the sample, i.e. by adding random by-family slopes. Where such models were feasible given the present 
sample size per family, they did not make a significant contribution to the model (and were hence discarded 
in the stepwise modelling process), nor did they affect the stability of the rank effect. 
Figure 1: Influence of ranks on the probability of marked Ps in the singular (left) 
and the non-singular (right), by macro continent. 


• Singular and non-singular number behave slightly differently with regard to the 
effects of the 12>3 subscale (the effect is largely irrelevant in the singular and mixed 
though not significantly contradictory in the plural). However, as there is also a 
significant main effect of Rank in the data, the hierarchy in (3) is robust enough 
across the number categories as well. 


• The macro continents behave differently with regard to the average cut-off point 
that is most relevant on the hierarchy. 


There are thus evidently areal patterns and restrictions in DOM, but the predicted 
effect of the referential hierarchy in (3) is uniform enough in our model to assume that 
it is universally valid, after all. Therefore, while BWZ argue "against universal effects of 
referential scales on case marking" (cf. the title of their paper), we would argue for the 
universality of precisely the effect, no matter which particular dimensions of referential 
prominence are the most relevant ones in individual languages or macro areas. This and 
further issues of interpretation deserve more elaborate discussion, provided in the next 
section. 


4 Interpreting the data 


In assessing the alleged universality of scale effects on case marking, a number of funda- 
mental questions arise that will influence one's conclusion on the matter. In the follow- 
ing subsections, we are going to discuss a selection of these, notably assumptions about 
methodological choices, geographical distributions, counterexamples, and the ontologi- 
cal status of scales, i.e. what they represent and what they are supposed to do. 
4.1 Methodological approaches to typological data 


At first glance, the most striking difference between BWZ's approach to modelling the 
data and ours is that the former is framed in terms of what Greenberg (1978) calls "the 
dynamicization of typology": As BWZ (p. 24) put it, "any evaluation" of alleged univer- 
sal pressures "needs to target trends in diachrony rather than current distributions". The 
Family Bias Method attempts to model such diachronic trends by investigating whether 
genealogical taxa tend to develop in keeping with the alleged universal (here: in the direc- 
tion predicted by a given referential scale) or not. Other dynamic approaches are based 
on estimating and comparing transition probabilities from the genealogical structure (i.e. 
family trees) of individual taxa (cf., e.g., Cysouw 2011; Dunn et al. 2011; Bickel, 
Witzlack-Makarevich, Choudhary, et al. 2015). All of these dynamic methods are, of course, 
promising developments in linguistic typology. But it should be borne in mind that they are not 
based on diachronic data, but on particular inferences drawn from synchronic distribu- 
tions and/or genealogical relations. And such inferences, in turn, usually involve delicate 
decisions on uncertain issues, such as the branch lengths in family trees, the threshold 
for defining diachronic biases or the way in which one extrapolates from large to small 
families. 

Again, we do not wish to call these methods into question, but in the absence of 
world-wide data on actual diachronic developments, we believe that densely sampled 
synchronic data are still a viable, legitimate and powerful source of evidence in linguistic 
typology. Instead of throwing out the synchronic baby with its bathwater, then, we have 
here followed equally recent methodological proposals by Cysouw (2010) and Jaeger et 
al. (2011) to model synchronic distributions by means of mixed-effects regression pro- 
cedures. These are standard ways of modelling variation in other disciplines, and while 
they cannot, by definition, target any diachronic trends, they are powerful means of 
staking out the room for universal pressures once family- and area-internal variation 
is controlled for. In fact, just like the Family Bias Method, they examine the number 
of “fits” and “violations” taxon by taxon (cf. Table 14 again). The difference is that our 
models end up taking all taxa in the data on board (including those that the Family Bias 
Method would have excluded as “internally diverse”) and that they always operate with 
the actual values of all isolates rather than estimating them based on extrapolation 
procedures. What we can obtain from this is a classic Greenbergian statement that “with 
overwhelmingly greater than chance frequency” (e.g. Greenberg 1966 [1963]: 79), 
systems of differential case marking tend to obey the referential hierarchy in (3) rather 
than going against it. 
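
The logic of a "greater than chance" statement can be illustrated with a minimal sketch. It is not the mixed-effects regression used in this paper: it is a much simpler exact binomial (sign) test over taxa, it ignores the areal and genealogical dependencies that the regression models control for, and the tallies are invented purely for illustration. The function name `binomial_tail` is likewise hypothetical.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """One-sided exact probability of observing at least `successes`
    hits in `trials` independent trials with chance level `p`."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical tallies (NOT taken from the paper's data): taxa whose
# splits follow the hierarchy ("fits") versus taxa that go against it.
fits, violations = 18, 2
p_value = binomial_tail(fits, fits + violations)
print(f"{fits} fits vs. {violations} violations: p = {p_value:.6f}")
```

With these toy numbers, a fit rate this lopsided would be very unlikely if fits and violations were equally probable, which is the intuition behind the Greenbergian formulation.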

Ultimately, then, it is fair to say that, at the current stage of research, synchronic and 
diachronic methods of modelling typological data have complementary advantages and 
drawbacks. And as long as that is the case, we see no reason to trust carefully sampled 
and analyzed synchronic data any less than diachronic inferences drawn from them. 
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4.2 Geographical universality 


A common assumption since at least Bossong (1985) has been that differential argu- 
ment marking, and its systematic correlation with referential categories, is “extremely 
widespread” (Aissen 2003: 439) and independent of macro-areal affiliations. 

In all fairness, these claims refer to differential P-marking only, and BWZ's data sug- 
gest that they would, indeed, be plainly wrong for A-marking. Although we do not know 
the principles according to which BWZ selected their sample languages, it seems safe 
to say that differential A-marking is generally dispreferred and its occurrence is skewed 
heavily towards Eurasia and Sahul, and here again towards Indo-European, Pama-Nyun- 
gan and perhaps Sino-Tibetan. For differential P-marking, on the other hand, the picture 
is less clear. The two largest distributional studies, namely BWZ and Sinnemäki (2014), 
appear to yield somewhat different results, which we set out and discuss for interested 
readers in the supplementary materials (SM4); from the facts presented there, it seems 
to us that when languages develop case marking for direct objects, the differential mark- 
ing type is indeed more likely, across the world's linguistic macro areas, than unsplit 
marking. 

But the overall distribution of DOM is actually less vital than another point raised in 
Sinnemäki (2014): He argues that the individual referential dimensions underlying DOM 
exhibit conspicuous areal contours. While animacy is distributed fairly evenly across 
the globe, definiteness/specificity shows a strong skewing towards Africa and the Old 
World more generally. In our model, too, we found some significant areal differences in 
the preferred cut-off points on the hierarchy in (3). However, we opine that such areal 
skewings do not invalidate the basic insight of the referential scales in (1) and in (3). As 
far as we can see, all versions of referential scales proposed in the typological literature 
are intended to be cross-linguistic generalizations over referentially-conditioned splits 
in individual languages, no matter which of the referential categories on a given scale 
are actually relevant in those languages. In other words, the hierarchy aims to capture a 
language with a particular person split in the pronouns just as much as a language with 
an animacy split among full NPs. Therefore, the requirement for the universality of scale 
effects is not that each individual subscale or referential dimension needs to be attested 
throughout the world, but that wherever referentially-conditioned splits do occur, they 
will strongly tend to obey the referential hierarchy rather than going in the opposite 
direction. Crucially, this latter issue is not addressed in Sinnemäki's (2014) paper: He 
asks which referential (or other structural) dimensions are responsible for differential 
object marking in the sample languages and how these dimensions are distributed geo- 
graphically. He does not, however, look at the directionality of the effect, i.e. whether a 
language that has an animacy split actually works in the predicted direction. To the 
extent that these effects are uniform (cf. §4.3 below), we do not see any reason to question 
the validity of referential scale effects on purely geographical grounds. 
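
The difference between asking which dimension conditions a split and asking whether the split goes in the predicted direction can be made explicit in a small sketch. It assumes, for illustration only, the four-point scale 12>3>Nhigh>Nlow and the prediction for P-marking that more prominent referents are the ones that receive overt case. The category labels and the function `p_split_direction` are hypothetical and are not part of any of the databases discussed here.

```python
# Assumed ordering for illustration, from most to least prominent.
SCALE = ["12", "3", "Nhigh", "Nlow"]


def p_split_direction(marked: set[str], unmarked: set[str]) -> str:
    """Classify a differential P-marking split against SCALE.

    'fit'       : every case-marked category outranks every unmarked one
    'violation' : every unmarked category outranks every marked one
    'mixed'     : neither holds (the split is not monotonic on the scale)
    """
    if not marked or not unmarked:
        raise ValueError("a split needs both marked and unmarked categories")
    rank = {category: i for i, category in enumerate(SCALE)}
    marked_ranks = [rank[c] for c in marked]
    unmarked_ranks = [rank[c] for c in unmarked]
    if max(marked_ranks) < min(unmarked_ranks):
        return "fit"
    if max(unmarked_ranks) < min(marked_ranks):
        return "violation"
    return "mixed"


print(p_split_direction({"12", "3"}, {"Nhigh", "Nlow"}))  # pronoun/noun split
print(p_split_direction({"Nlow"}, {"12", "3", "Nhigh"}))  # reversed split
print(p_split_direction({"12", "Nhigh"}, {"3", "Nlow"}))  # non-monotonic
```

On this view, a language counts towards the universal regardless of where its cut-off point lies, as long as the marked categories sit above the unmarked ones.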
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4.3 Structural universality 


Bossong (1985: VIII) voices a common opinion among comparative linguists when he 
claims that the patterns of differential object marking are "structurally uniform [...] 
around the earth" (our translation), in the sense that whenever DOM is driven by ref- 
erential properties, it follows the direction given in (2) above rather than going against 
it. BWZ extend this assumption to differential A-marking as well and ask whether "there 
exists one or more universal scale(s) on which all [split] systems fit" (p. 22), and we al- 
ready know that their conclusion is negative. 

There are two issues involved here. The first and more important one pertains to the 
number or weight of counterexamples. The figures above suggest that splits in terms of 
animacy, definiteness and other high/low-contrasts are almost without exception, for 
both A- and P-marking (Table 3 and Table 5 above). The same holds when languages 
make "clean" splits between nouns and pronouns (Table 4). From this perspective, the 
lower end of the traditional referential hierarchy, as well as its global ranking of pro- 
nouns and nouns, can be considered structurally uniform, indeed (cf. also Levshina 2018 
for further statistical corroboration). What is more problematic is the internal ranking 
of person and number distinctions, i.e. particularly the upper part of the referential 
hierarchy. Here, Table 7–Table 12 suggest considerable language-specific variation and thus 
idiosyncratic historical developments (cf. also Filimonova 2005 on this point). There- 
fore, when BWZ test for scales involving particular pronominal splits (e.g. 1>2>3>N or 
12>3>N), and with the cross-cutting number distinctions being disregarded, it is not sur- 
prising that they find a number of exceptions; in fact, they even find roughly as many 
family biases in favour of and against these scales (cf. Appendix). By contrast, our 
alternative regression analysis of the 12>3>Nhigh>Nlow scale still showed a robust enough 
effect for this particular person split (in both the type model and the rank model), even 
when number is taken into account as a separate variable. Taken together, our analy- 
ses suggest that referential effects on case marking are sufficiently homogeneous to be 
considered universal, at least by typologists who (unlike Bickel et al.) accept purely syn- 
chronic evidence as a valid basis for establishing universals. 

A second point about structural homogeneity relates to BWZ's finding (p. 34) that 
no single scale they tested fits A and P simultaneously. As with Sinnemäki's (2014) 
argument about the areal restrictedness of animacy or definiteness, one may object here 
that it actually does not matter whether the high-low distinction is less important for 
A-marking than for P-marking. In fact, it has recently been emphasized that A and P are 
not simply “each other's mirror-image” (Fauconnier & Verstraete 2014) in a number of 
ways, and hence also differ in regard to the referential properties that are relevant when 
they are case-split. It may thus very well be the case that the referential hierarchies in 
(1) are poorer predictors for A- than for P-marking because they miss some of the cru- 
cial dimensions (e.g. particular kinds of focus) and overstate others (e.g. animacy and 
definiteness). However, to the extent that they are applicable, it is again the predicted 
effects that are at stake here. And as we saw above, the effect is strikingly homogeneous 
as far as high-low distinctions and the clean Pro-N splits are concerned. Where A and 
P may respond very differently is the referential dimension of number, as was shown 
in Table 10–Table 12, so that we see opposing rather than uniform effects of the alleged 
SG > PL scale. This is certainly worth further investigation, but given that most versions 
of the referential hierarchy are not even concerned with number contrasts, we do not 
see this as a serious challenge to referential scale effects in general. 


4.4 The status and purpose of referential scales 


In this final section, we would like to comment on two remarks by BWZ on the useful- 
ness of scales in typological research. The first one relates to the fact that by far most 
languages work in terms of a specific binary opposition, which is why BWZ explicitly 
reject the terms “scale” or “hierarchy” to capture such simple splits. In our view, this 
issue is largely terminological in nature: In so far as binary oppositions (like Pro > N) 
are implicational statements as we see them in other typological domains (e.g. SG > 
PL or VOICED PLOSIVES > VOICELESS PLOSIVES), we are not averse to calling them 
“(implicational) scales” or “(implicational) hierarchies”. The more important issue is the second 
one, relating to the level of abstraction at which comparative scales are formulated. 
Recall that BWZ find positive evidence for their Pro/Nhigh > Nlow scale, but they question 
the usefulness of such a scale precisely because it seems too heterogeneous to reflect 
a single underlying principle (p. 36 of their paper). The same kind of criticism may ac- 
tually be levelled against the extended hierarchies in (1a) and (3), which also conflate a 
number of logically distinct dimensions (e.g. a person contrast within the pronouns, a 
split in nominality and various other properties). The question is, therefore, to what ex- 
tent the postulation of more abstract (i.e. extended, multidimensional or more general) 
hierarchies is justified. 

Generally speaking, the motivation behind postulating referential scales is to capture 
constraints on cross-linguistic variation. Mapping diverse language-specific oppositions 
onto more abstract comparative scales firstly serves the purpose of increasing the scope 
of the constraint; as compared to individual scales, it is thus arguably a more elegant 
way of formulating cross-linguistic generalizations. It does, however, also suggest that 
there is a unified explanation for the phenomenon in question. Gildea & Zúñiga (2016), 
for example, note that the referential hierarchy has often been taken to reflect a coher- 
ent cognitive phenomenon, a “representational constraint" in the sense of Haspelmath 
(submitted) or Elman et al. (1996). For example, Kiparsky (2008: 39-40) characterizes his 
version of the referential hierarchy as an “inviolable [...] part of the design of language”, 
i.e. of "U[niversal] G[rammar]”. In so far as such representational principles directly con- 
strain the possible shapes of case-marking systems, the postulated hierarchy is said to 
explain the cross-linguistic patterns we observe.¹⁵ 


14 Exceptions to this are languages that make a certain kind of split in the pronouns (e.g. 12>3) and a different 
one in the nouns (e.g. Nhigh>Nlow, cf. Table 7 above), or languages that use multiple cases or different case 
allomorphs differentially, depending on referential properties. 

15 A formal account of a very different kind is presented in Aissen (2003), but the conclusion ultimately also 
reads like a UG-based representational constraint: “[T]he principles underlying DOM” may be “part of 
core grammar”, implemented by a “universally fixed [...] ranking of constraints” (Aissen 2003: 439-440). 
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In functional-typological work, referential hierarchies are not inviolable “top-down” 
principles of cognition; the correlations they capture (i.e. between an argument’s ref- 
erential prominence and its likelihood of receiving special case marking) are typically 
given more probabilistic explanations in terms of language usage and change.¹⁶ Now, if 
one believes that these correlations fall out entirely from local processes of grammatical- 
ization and can be fully explained by reference to the respective source construction (e.g. 
Cristofaro 2013), there is really no gain in postulating an extended or more abstract hier- 
archy beyond individual referential dimensions. By contrast, for typologists who argue 
that these individual dimensions can receive a unified explanation, such an abstraction 
is more useful. Perhaps the best-known line of argumentation in this direction is that of 
communicative efficiency (e.g. Dixon 1979; Comrie 1981; Newmeyer 2005; Haspelmath 
2008; Hawkins 2014): Speakers tend to mark those A and P arguments whose syntactic 
function is relatively unexpected (or surprising) given their referential properties, while 
expected role-reference constellations are left unmarked (cf. also Haspelmath 2018 for a 
systematization of this proposal). Crucially, this account is said to work for all kinds of 
referential splits in the same way, whether they are based on animacy, definiteness or 
other kinds of prominence in particular languages. While still in need of further corrob- 
oration, there is mounting evidence from frequency data (e.g. Dahl 2000; Fry 2003; Lee 
2006; Jäger 2007), psycholinguistic experimentation (e.g. Fedzechkina et al. 2012; 
Kurumada & Jaeger 2015) and computer simulations (e.g. Lestrade, this volume) in favour of 
this approach, at least for DOM (cf. also Levshina 2018). 
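
The notion of an "unexpected" role-reference constellation can be given a rough quantitative reading. The sketch below uses surprisal, i.e. the negative log probability of a role given a referent type, which is only one possible operationalization and is our assumption rather than a measure taken from the studies cited above. The counts are invented toy values, not corpus data.

```python
from math import log2

# Hypothetical counts of transitive arguments (invented for
# illustration): how often each referent type occurs as A and as P.
counts = {
    "animate": {"A": 900, "P": 100},
    "inanimate": {"A": 100, "P": 900},
}


def surprisal(referent: str, role: str) -> float:
    """Surprisal in bits of a role given the referent type:
    -log2 P(role | referent)."""
    total = sum(counts[referent].values())
    return -log2(counts[referent][role] / total)


for referent in counts:
    for role in ("A", "P"):
        print(f"{referent:9s} as {role}: {surprisal(referent, role):.2f} bits")
```

Under these toy frequencies, animate Ps and inanimate As carry far more surprisal than the reverse constellations, which is where the efficiency account expects overt marking to be concentrated.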

In sum, then, the postulation of more abstract or multidimensional referential hier- 
archies is not just an elegant way of formulating cross-linguistic generalizations about 
case splits. It is also useful if one believes that a unified explanation can be given to 
those splits. With regard to the latter, we currently see little, if any, evidence for an in- 
nate, inviolable referential hierarchy in Kiparsky's sense, but accumulating evidence in 
favour of functional explanations that operate with probabilistic constraints on usage 
and diachronic change.¹⁸ 


16 There are, of course, also attempts in the typological literature to link implicational universals and semantic 
maps to “conceptual spaces”, i.e. coherent “regions” of the human mind (cf. Croft 2003). But this sort of 
cognitive interpretation does not seem to be prominent for the referential hierarchy. For a general critique 
of this approach, see Cristofaro (2010). 

17 As we saw earlier, differential A-marking is generally rarer, geographically and genealogically more re- 
stricted, and no parallel evidence from psycholinguistic experimentation is currently available. Moreover, 
there is compelling evidence that differential A-marking involves additional motivations that do not ap- 
ply to P-marking in the same way (de Hoop & Malchukov 2008; Fauconnier & Verstraete 2014). For these 
reasons, it is presently rather difficult to estimate just how much of differential A-marking is amenable to an 
account in terms of communicative efficiency. 

18 A reviewer of the paper remarked that this formulation, and the efficiency explanation in general, is 
basically diachronic in nature, which s/he sees as a contradiction to the kind of synchronic typology we have 
practised here. But these are actually two independent issues. Efficiency explanations are first and foremost 
about the choices, however subconscious, that individual speakers make for or against overt case marking 
in online production (and hence “synchronically”, in a sense); these necessarily have to propagate in time 
and space to conventionalize into a grammatical pattern, which adds a diachronic component to the ex- 
planation. But since we cannot sample these processes in the same way that we can sample their results 
across the world's languages, we believe that the synchronic states that we have investigated here are still 
a viable data source for typologists. This is hence a purely methodological point and does not contradict 
the fact that usage-based explanations involve diachrony. 
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5 Conclusion 


In this paper, we have attempted to re-present and reanalyze BWZ's typological data on 
differential case marking. Their database, along with Sinnemäki's (2014), constitutes the 
largest current repository for gauging case-marking patterns in the world's languages, 
and we would thus like to acknowledge again the tremendous amount of cross-linguistic 
groundwork that these colleagues have carried out. Moreover, Bickel's (2011; 2013) Fam- 
ily Bias Method is a valuable addition to the toolkit of quantitative typology, as it starts 
out from considering how possibly universal pressures on language should play out in 
the diachronic development of families. It is thus conceptually different from the kinds 
of regression models that we have used in the present paper, although it operates with 
exactly the same kind of synchronic typological data. The most important technical dif- 
ference is that its final results are based on statistically significant biases in large families 
and their extrapolation to small taxa and isolates; it thus neglects large families without 
biases and introduces some noise into the data from small taxa (cf. Appendix again). The 
major goal of the present paper was to complement these Family Bias estimations with 
a look at the actual “raw” data on various referential dimensions and to present an alter- 
native statistical model of the data that relies on widely used regression procedures on 
the full data set. 

In doing so, we found less counterevidence than BWZ's results and their rhetoric 
suggest. The global structure of the classic hierarchies (pronouns > nouns) and all high-low 
prominence distinctions (animacy, definiteness, topicality, kinship) are almost without 
exception, and while there is more variation within the pronominal domain, a closer look 
at the data reveals that the number of counterexamples is not significant enough to over- 
ride the strong support that the referential hierarchy in (3) receives from our statistical 
models. 

Therefore, our conclusion is the opposite of BWZ's, namely that there is evidence for 
universal scale effects on case marking. We can subscribe to this view for the following 
reasons: 


* Unlike BWZ, we accept purely synchronic evidence for postulating universal 
preferences (provided it is as statistically robust as in the present case). 


* Unlike Sinnemäki (2014), we do not require that the individual referential 
properties be involved in DAM in all macro areas to the same degree; what 
matters is that the direction of the effect is uniform, regardless of which specific 
referential dimensions it comes from. 


* Unlike BWZ, we obtain a positive statistical signal even when several referential 
dimensions are combined into a larger scale. 


* Unlike BWZ, we have no reservations about applying the label “scale” even to binary 
oppositions (which is how most languages work to begin with). That is, even if we 
did not wish to operate with extended scales such as (1) or (3), we would argue for 
the existence of “scale effects”. 


As laid out in §4, we believe that working with multi-term or abstract scales can be 
useful if one has an explanatory account that unites the various referential dimensions 
under a single principle. While we reject the view that such a referential hierarchy con- 
stitutes an innate representational constraint, we are sympathetic to a functional view 
that relates different referential contrasts to a common principle of efficient information 
processing. 
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Appendix: Bickel, Witzlack-Makarevich & Zakharko's 
(2015) Family Bias estimations 


In this appendix, we provide some of Bickel, Witzlack-Makarevich & Zakharko's (2015) 
[henceforth BWZ, as in the main text] results for comparison with our own analysis. 
Readers familiar with the Family Bias Method may thus jump ahead to Table 16 and 
Table 17 below; for uninitiated readers, we first provide some comments on how to in- 
terpret the figures. For a more detailed introduction to the Family Bias Method as such, 
such readers are referred to Bickel (2013). 

The key question that BWZ seek to address is whether a given referential scale shapes 
the diachronic evolution of language families. BWZ take the synchronic internal com- 
position of each family as indicative of such directed diachronic processes: If a family is 
significantly biased (on synchronic grounds) towards fitting a scale rather than in the 
opposite direction, this may be indicative of the family having developed in the predicted 
direction, either by continually retaining the fit on each evolutionary trial (i.e. with each 
new daughter language) or by “correcting” a non-fitting case system at the next clado- 
genetic juncture (i.e. with a new daughter language). A universal signal for scale effects 
would then amount to most families in a representative sample being significantly biased 
in the predicted way, again independently of geographical affiliations. 

It is obvious that such biases can only be estimated for sufficiently large families (here: 
N > 5 members). Bickel's method thus extrapolates these estimations to smaller families 
and isolates. As a consequence, the synchronic data for a language isolate are not simply 
taken at face value, but as surviving traces of an erstwhile family that itself may or may 
not have had a principled bias in differential argument marking. In other words, one 
reckons with the possibility that a given isolate can be the survivor of a family with the 
opposite bias, or no bias at all. Depending on how strong and uniform the biases are in 
large families, the method may thus deliberately introduce some “noise” to the data from 
small families and isolates, rather than always taking their actual values as we find them 
in the synchronic data. Because of such "interventions" with the data, the extrapolation 
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process is repeated hundreds or even thousands of times and the average results of all 
estimations are then taken as the final basis for exploring universal trends. 
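
The procedure just described can be sketched in a few lines. This is a deliberately simplified illustration based only on the description above, not a reimplementation of Bickel's method: the significance threshold `alpha`, the way the rates from large families are applied to small ones, the tie rule, and all names and toy data are our assumptions.

```python
import random
from math import comb


def two_sided_binomial_p(hits: int, n: int) -> float:
    """Exact two-sided p-value for `hits` out of `n` under chance (p = 0.5)."""
    extreme = max(hits, n - hits)
    tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2**n
    return min(1.0, 2 * tail)


def mean_biased_taxa(families, min_size=6, alpha=0.1, runs=1000, seed=0):
    """`families` maps a taxon to a list of booleans, one per language
    (True = the split fits the scale). Returns the mean number of taxa
    biased towards fitting and towards violating the scale."""
    rng = random.Random(seed)
    large = {f: v for f, v in families.items() if len(v) >= min_size}
    small = {f: v for f, v in families.items() if len(v) < min_size}

    # Step 1: test each large family (N > 5) for a significant majority.
    biased = []  # (fits_scale, share of the majority value)
    for values in large.values():
        hits, n = sum(values), len(values)
        if two_sided_binomial_p(hits, n) < alpha:
            biased.append((hits * 2 > n, max(hits, n - hits) / n))
    fit = sum(1 for fits_scale, _ in biased if fits_scale)
    anti = len(biased) - fit
    if not large or not small or not biased:
        return float(fit), float(anti)

    # Step 2: rates learned from large families (assumed simplification).
    p_biased = len(biased) / len(large)
    p_keep = sum(share for _, share in biased) / len(biased)

    # Step 3: repeated random extrapolation to small taxa, then averaging.
    extra_fit = extra_anti = 0
    for _ in range(runs):
        for values in small.values():
            if rng.random() >= p_biased:
                continue  # treated as diverse in this run, so not counted
            fits_scale = sum(values) * 2 >= len(values)  # ties count as fit
            if rng.random() >= p_keep:
                fits_scale = not fits_scale  # survivor of the opposite bias
            if fits_scale:
                extra_fit += 1
            else:
                extra_anti += 1
    return fit + extra_fit / runs, anti + extra_anti / runs


# Toy data, invented for illustration only.
toy = {
    "FamA": [True] * 9 + [False],
    "FamB": [True] * 4 + [False] * 4,
    "FamC": [False] * 7,
    "Isolate1": [True],
    "Isolate2": [False],
    "SmallD": [True, True, False],
}
fit_mean, anti_mean = mean_biased_taxa(toy)
print(f"+fit: {fit_mean:.2f}, -fit: {anti_mean:.2f}")
```

Because the small taxa are counted probabilistically and averaged over many runs, the resulting totals are typically non-integer, and a large family without a significant majority (like the toy "FamB") contributes to neither count.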

It is against this background that BWZ's Family Bias estimations need to be inter- 
preted. Therefore, the following things need to be kept in mind when looking at the 
figures below: 


* The numbers always pertain to taxa (i.e. genealogical units) rather than languages. 


* The numbers exclude taxa that have been estimated to be diverse (rather than bi- 
ased), as internally diverse taxa are argued not to yield conclusive evidence for the 
family to be shaped by a given referential scale. 


* The figures contain non-integer numbers, as the extrapolation to small families 
and isolates is repeated many times and averaged over; the results thus display 
the means of several hundred runs of bias estimations. 


In Table 16 and Table 17, we present the results of BWZ's type model (cf. our §3.1 for 
comparison). 


The first column of Table 16 and Table 17 lists the scales that were tested as possible 
candidates for universal referential hierarchies. As can be seen, each of these scales 
requires that the manifold language-specific referential categories (like the 3SG.PRO.NHUM 
category from above) are subsumed under a more general category (like “3” in the first 
scale or “3/N” in the second). The figures in the remaining columns indicate how many 
taxa (large and small) were estimated to be significantly biased in the direction predicted 
by each scale (“+fit”) or against it (“-fit”). As far as we can tell from the raw data, there 
is a total of 80 taxa in BWZ's database that show some kind of P-split, so the figures in 
the last column of Table 16 should be compared against this overall number. For 
example, out of the 80 taxa, only about 7 show a significant bias towards being driven by the 
SAP > 3/N scale, i.e. where speech-act participants (= SAP or 1,2) behave differently with 
regard to case marking from third-person referents (3/N). Conversely, this means that 
the vast majority of taxa were estimated not to show a significant bias along this scale. 
Crucially, for the 7 taxa that are estimated to be biased, there is no clear signal in favour 
of the proposed scale, as in each of the three macro areas compared here, the number of 


Table 16: Results of Bickel, Witzlack-Makarevich & Zakharko's (2015: 34) type-model 
analysis of P-splits 


Scale                  Eurasia         Sahul           Other           N 
                       +fit    -fit    +fit    -fit    +fit    -fit 
1>2>3>N                0.66    0.67    1.35    1.04    0.16    2.87    6.75 
SAP>3/N                0.78    0.53    1.21    1.12    1.23    2.19    7.06 
SAP>3>N                0.66    0.69    1.32    1.04    0.35    2.58    6.63 
SAP>3>N-high>N-low     0.34    0.01    0       0       0.03    0.49    0.87 
Pro>N                  12.89   1.92    5.93    0.39    8.15    2.75    32.04 
Pro/N-high>N-low       8.11    0.08    2.8     0.18    4.55    0.49    16.21 
nsg>sg                 0       4.3     0.04    0.62    0.19    3.86    9 
sg>nsg                 2.38    1.98    0.66    1.7     2.23    1.78    10.73 


Table 17: Results of Bickel, Witzlack-Makarevich & Zakharko's (2015: 34) type-model 
analysis of A-splits 


Scale                  Eurasia         Sahul           Other           N 
                       +fit    -fit    +fit    -fit    +fit    -fit 
1>2>3>N                1.74    1.03    0       0       0       0       2.77 
SAP>3/N                1.49    0       0       0       0       0       1.49 
SAP>3>N                1.51    0       0       0       0       0       1.51 
SAP>3>N-high>N-low     0.32    0.01    0       0       0       0       0.33 
Pro>N                  1.51    0       2.29    0.1     0.52    0.47    4.89 
Pro/N-high>N-low       1.56    0.1     1.62    0.05    0.002   0.5     3.86 
nsg>sg                 1.05    1.69    0       0       0       0       2.74 
sg>nsg                 0       1.48    0       1       0       0       2.48 


scale-conforming taxa is counterbalanced by a roughly equal (or even higher) number 
of scale-violating taxa. According to BWZ, then, this provides clear evidence against a 
universal effect of an alleged SAP > 3/N scale, and similar conclusions carry over to most 
other scales they test: The overall number of biased taxa is extremely small in each case, 
and the counterevidence is in the same range as the fitting cases (except for Pro > N and 
for Pro/Nhigh > Nlow). 
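
The comparison of fitting and non-fitting taxa can be retraced by summing the area-wise figures for a few P-split scales from Table 16. The sketch below assumes that cells printed without a decimal point are two-decimal values (e.g. 1.23 and 5.93), a reading under which each row adds up to its N total within rounding. The helper name `fit_totals` is ours.

```python
# Taxa counts from Table 16 (P-splits), entered per scale as
# (Eurasia +fit, -fit, Sahul +fit, -fit, Other +fit, -fit).
table16 = {
    "SAP>3/N": (0.78, 0.53, 1.21, 1.12, 1.23, 2.19),
    "Pro>N": (12.89, 1.92, 5.93, 0.39, 8.15, 2.75),
    "Pro/N-high>N-low": (8.11, 0.08, 2.80, 0.18, 4.55, 0.49),
}


def fit_totals(row):
    """Sum the scale-conforming and scale-violating taxa across areas."""
    conforming = sum(row[0::2])  # +fit cells
    violating = sum(row[1::2])   # -fit cells
    return round(conforming, 2), round(violating, 2)


for scale, row in table16.items():
    conforming, violating = fit_totals(row)
    print(f"{scale}: +fit {conforming}, -fit {violating}")
```

For SAP>3/N the two totals are of similar size, whereas for Pro>N and Pro/N-high>N-low the conforming taxa clearly outnumber the violating ones, which is the asymmetry noted in the text.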
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Diachrony of differential argument 
marking 


While there are languages that code a particular grammatical role (e.g. subject or direct object) 
in one and the same way across the board, many more languages code the same grammatical 
roles differentially. The variables which condition the differential argument marking (or DAM) 
pertain to various properties of the NP (such as animacy or definiteness) or to event semantics 
or various properties of the clause. While current research on DAM is mainly 
synchronic, this volume tackles the diachronic perspective. The tenet is that the emergence and 
the development of differential marking systems provide a different kind of evidence for the 
understanding of the phenomenon. The present volume consists of 18 chapters and primarily 
brings together diachronic case studies on particular languages or language groups including 
e.g. Finno-Ugric, Sino-Tibetan and Japonic languages. The volume also includes a position paper, 
which provides an overview of the typology of different subtypes of DAM systems, a chapter 
on computer simulation of the emergence of DAM and a chapter devoted to the cross-linguistic 
effects of referential hierarchies on DAM. 
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